5/22/2016

Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces
April 18, 2016
William Herlands
Committee: Daniel Neill, Alex Smola, Wilbert van Panhuis
Chair: Dave Choi

Outline
1. Motivation
2. Gaussian process introduction
3. Change surface model
4. Analysis of measles in the United States
Complex Changes
• In human systems, changes are often complex
  – Policy interventions take time to trickle through government bureaucracy
  – Environmental hazards affect populations differentially
• Simple changepoint models are not sufficiently expressive

Why do we care?
• Understand past changes
  – Explore spatio-temporal heterogeneity
  – Model the rate of changes in different areas
• Enable more accurate or equitable policies
• Applications
  – Measles incidence in the U.S.
  – Concerns about lead-tainted water in NYC
[Figure: two panels dated 08 Jul 2014 and 21 Oct 2015]
Our objectives
• Model complex changes in real world data
  – Multiple, flexible function regimes
  – Non-discrete changes
  – Non-monotonic changes
  – Heterogeneous changes over space, time, etc.
• Gaussian processes for flexible functions
• “Change surfaces” for complex changes

Gaussian Processes (GP)
• Non-parametric prior over smooth functions
  $f(x) \sim \mathcal{GP}(m(x), k(x, x'))$
  $m(x) = \mathbb{E}[f(x)]$
  $k(x, x') = \mathrm{cov}(f(x), f(x'))$
• The covariance function is a kernel; it defines the covariance of function values
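To make the GP prior concrete, the following is a minimal sketch of drawing functions from a zero-mean GP. The squared exponential kernel, its hyperparameters, the input grid, and the jitter term are all assumptions chosen for illustration; the slides do not prescribe them here.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel k(x, x') for 1-D inputs (illustrative choice)."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 100)          # hypothetical input grid
m = np.zeros_like(x)                     # zero mean function m(x)
K = se_kernel(x, x)                      # covariance of the function values

# Jitter is a numerical stabiliser for the Cholesky factor, not part of the model.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))

# Each column is one draw of [f(x_1), ..., f(x_m)] ~ N(m(x), K).
samples = m[:, None] + L @ rng.standard_normal((len(x), 3))
```

Changing `lengthscale` changes how quickly the sampled functions vary, which is the sense in which the kernel defines the covariance of function values.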
Gaussian Processes (GP)
• Any finite set of f(x) is Normally distributed
  $[f(x_1), \ldots, f(x_m)] \sim \mathcal{N}(m(x), K)$
• Observation model
  $y(x) = f(x) + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2)$
• Marginal log likelihood optimization
  $\log p(y \mid \theta) \propto -\log|K + \sigma^2 I| - y^\top (K + \sigma^2 I)^{-1} y$

Full Model
• Our model is a convex combination of the $f_i$
  $y(x) = s_1(x) f_1(x) + \ldots + s_r(x) f_r(x) + \epsilon_n$
  – Switching functions $s_i(x)$, with $\sum_{i=1}^{r} s_i(x) = 1$
  – Functional regimes $f_i(x)$
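The marginal log likelihood from the Gaussian Processes (GP) slide can be evaluated exactly with a Cholesky factorization. This is a sketch of the standard dense computation, not the scalable method of the talk; it includes the $\tfrac{1}{2}$ factors and the $n \log 2\pi$ constant that the proportional form omits. The kernel, data, and noise variance below are hypothetical.

```python
import numpy as np

def log_marginal_likelihood(K, y, noise_var):
    """Exact log p(y | theta) for a zero-mean GP; O(n^3) because of the Cholesky."""
    n = len(y)
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # (K + noise_var I)^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log|K + noise_var I|
    return -0.5 * (log_det + y @ alpha + n * np.log(2.0 * np.pi))

# Hypothetical data: 20 inputs, squared exponential kernel with lengthscale 0.2.
x = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2)
y = np.sin(6.0 * x)
lml = log_marginal_likelihood(K, y, noise_var=0.1)
```

In hyperparameter learning this quantity would be maximized over the kernel parameters and noise variance; the cubic cost of the factorization is what motivates the scalable inference discussed later.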
Model part 1: Functional Regimes
• GP prior for each functional regime
  – Use flexible stationary kernels
  $f_i \sim \mathcal{GP}(0, K_i), \quad i = 1, \ldots, r$

Model part 2: Change Surfaces
• Changepoint: $s_i = \mathbb{I}(t > T_i)$
• Non-discrete changepoint: $s_i = \mathrm{softmax}(t - T_i)$
• Change surface: $s_i = \mathrm{softmax}(w_i(t)) = \sigma(w_i(t))$
[Plots: each switching function on a 0 to 1 vertical axis, over inputs from −10 to 10]
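The three kinds of switching function on the Change Surfaces slide can be compared side by side. This is a sketch under assumptions: the direction of the indicator (`t > T`), the two-regime sigmoid form, and the sinusoidal `w_1`, `w_2` are illustrative choices, not taken from the slides.

```python
import numpy as np

def hard_changepoint(t, T):
    """Discrete changepoint: 0 before T, 1 after (direction is an assumption)."""
    return (t > T).astype(float)

def sigmoid(z):
    """Two-regime softmax, giving a smooth (non-discrete) changepoint."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(W):
    """Row-wise softmax; W has shape (n, r), one column per regime."""
    W = W - W.max(axis=1, keepdims=True)   # subtract the row max for stability
    E = np.exp(W)
    return E / E.sum(axis=1, keepdims=True)

t = np.linspace(-10.0, 10.0, 201)
s_hard = hard_changepoint(t, T=0.0)
s_soft = sigmoid(t - 0.0)

# Hypothetical non-monotonic w_1(t), w_2(t) standing in for a learned change surface.
W = np.column_stack([np.sin(t / 3.0), -np.sin(t / 3.0)])
S = softmax(W)                              # S[:, i] is s_i(t)
```

Because each row of `S` is non-negative and sums to one, the resulting model is a convex combination of the functional regimes at every input.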
Model part 2: Change Surfaces
• Random Kitchen Sink features for $w_i(x)$
  – Variable rate of change
  – Non-monotonic
  – Heterogeneous over input
  $w_i(x) = \sum_{j=1}^{q} a_j \cos(\omega_j^\top x + b_j)$

Full Model
• Gaussian process change surface model
  $y(x) = \sum_{i=1}^{r} \sigma(w_i(x)) f_i(x) + \epsilon_n$
  $f_i(x) \sim \mathcal{GP}(0, K_i)$
• Can depict this as a single Gaussian process with covariance function
  $k_{\mathrm{all}}(x, x') = \sum_{i=1}^{r} \sigma(w_i(x)) \, k_i(x, x') \, \sigma(w_i(x'))$
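The combined covariance function can be assembled directly from its definition. The sketch below draws the Random Kitchen Sink parameters at random and uses squared exponential regime kernels with made-up lengthscales; in practice these would be learned, so every numeric value here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

def rks_features(X, omega, b, a):
    """w(x) = sum_j a_j cos(omega_j^T x + b_j); X is (n, d), omega is (q, d)."""
    return np.cos(X @ omega.T + b) @ a

def se_kernel(X1, X2, lengthscale):
    """Squared exponential kernel on (n, d) inputs (illustrative regime kernel)."""
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

n, d, q, r = 50, 1, 5, 2
X = np.linspace(-3.0, 3.0, n).reshape(n, d)

# One RKS function w_i per regime, with hypothetical random parameters.
W = np.column_stack([
    rks_features(X,
                 rng.standard_normal((q, d)),
                 rng.uniform(0.0, 2.0 * np.pi, q),
                 rng.standard_normal(q))
    for _ in range(r)
])

# Softmax across regimes gives the switching weights sigma(w_i(x)).
S = np.exp(W - W.max(axis=1, keepdims=True))
S /= S.sum(axis=1, keepdims=True)

lengthscales = [0.3, 2.0]                   # hypothetical, one per regime
K_all = sum(
    np.outer(S[:, i], S[:, i]) * se_kernel(X, X, lengthscales[i])
    for i in range(r)
)
```

Each term is a valid kernel matrix scaled on both sides by the same weights, so `K_all` stays symmetric positive semi-definite even though it is non-stationary.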
Scalable Inference
• Log likelihood is naively $O(n^3)$
  $\log p(y \mid \theta) \propto -\log|K + \sigma^2 I| - y^\top (K + \sigma^2 I)^{-1} y$
• We develop scalable Kronecker inference using the Weyl bound, $O(D n^{(D+1)/D})$

Measles in the United States
• Data
  – Monthly incidence rates, 1935–2003
  – Continental United States and D.C.
  – $x \in \mathbb{R}^3$: 2D space and 1D time
  – Measles vaccine introduced in 1963
Measles in 3 states
[Figure: three panels (California, Maine, Michigan), each showing measles incidence (1000s) and the change surface σ(w(x)) on a 0 to 1 axis, over 1940–2010]
Measles in 3 states
• GP change surface
  – 2 functional regimes
  – $w_i(x)$ as RKS with 5 features
• Not a causal model!
Measles in 3 states
• “Change date” per state: date where σ(w(x)) = 0.5
• “Change slope”: slope of σ(w(x)) from 0.25 to 0.75
[Figure: Michigan panel showing measles incidence (1000s) and σ(w(x)) over 1940–2010, annotated with the change date and change slope]
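Change date for measles in U.S.
• For each state, the date where σ(w(x)) = 0.5
[Figure: change date by state, scale from 1961.5 to 1967.2]

Change slope for measles in U.S.
• For each state, the slope of σ(w(x)) between 0.25 and 0.75
[Figure: change slope by state, scale from 0.156 to 0.297]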
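The two per-state summaries can be read off a fitted change surface by interpolation. This is a sketch, assuming that σ(w(x)) for a state is available on a time grid and rises monotonically through the levels of interest, and that the slope is the rise (0.5) divided by the time taken to go from 0.25 to 0.75. The sigmoid centred on 1965 is a made-up stand-in for a fitted surface.

```python
import numpy as np

def crossing_time(t, s, level):
    """Linearly interpolate the first time an increasing s(t) reaches `level`."""
    idx = int(np.argmax(s >= level))       # first index at or above the level
    if s[idx] < level or idx == 0:
        raise ValueError("level is not crossed inside the grid")
    t0, t1, s0, s1 = t[idx - 1], t[idx], s[idx - 1], s[idx]
    return t0 + (level - s0) * (t1 - t0) / (s1 - s0)

def change_date_and_slope(t, s):
    """Change date: s = 0.5. Change slope: average slope between s = 0.25 and 0.75."""
    t25, t50, t75 = (crossing_time(t, s, q) for q in (0.25, 0.5, 0.75))
    return t50, (0.75 - 0.25) / (t75 - t25)

# Hypothetical change surface for one state on a monthly grid.
t = np.linspace(1935.0, 2003.0, 817)
s = 1.0 / (1.0 + np.exp(-(t - 1965.0) / 2.0))
date, slope = change_date_and_slope(t, s)
```

A surface that falls rather than rises over time could be handled by applying the same functions to `1 - s`.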
Regression Analysis
• Explore factors that affect the change date
  – Birth and death rates
  – Population numbers per age segment
  – Income information
  – Government hospital and health workers
  – Slope of change surface
  – Average temperature

Demographic Analysis
Regression Analysis
• Gini of family income
  – Economically depressed communities
  – Rural regions
• Slope of change surface
  – Fewer cases nationwide enable more effective immunization later
[Figure: change date by state, scale from 1961.5 to 1967.2]

Conclusions
• Introduced a model for “change surfaces” in real world data
• Developed scalable inference for additive, non-stationary Gaussian processes
• Identified heterogeneity in the first years of the measles vaccine
• Used the results of the change surface model for policy-relevant conclusions
Acknowledgements
• Committee
  – Daniel Neill, Alex Smola, Wilbert van Panhuis
• Chair
  – Dave Choi
• Collaborators*
  – Andrew Wilson
  – Seth Flaxman
  – Hannes Nickisch
*Subset of paper accepted to AISTATS 2016

Questions?
Fin.
Backup slides
Spectral Mixture Kernels

Inference
• Compute log marginal likelihood
• General Kronecker methods for scalability
  – Assume:
  – Assume: multiplicative kernel across D
  – Then we can decompose kernel matrix,
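The benefit of a Kronecker decomposition can be illustrated on the log determinant. The sketch below assumes the standard setting for Kronecker methods, which the slide does not spell out: inputs on a two-dimensional grid and a kernel that multiplies across dimensions, so the kernel matrix is the Kronecker product of two small per-dimension matrices. Grid sizes, lengthscales, and noise variance are hypothetical.

```python
import numpy as np

def se_kernel(x, lengthscale):
    """Squared exponential kernel on a 1-D grid (illustrative choice)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Hypothetical 2-D grid: 8 points along one dimension, 6 along the other (n = 48).
K1 = se_kernel(np.linspace(0.0, 1.0, 8), 0.3)
K2 = se_kernel(np.linspace(0.0, 1.0, 6), 0.5)
noise_var = 0.1

# The eigenvalues of kron(K1, K2) are all pairwise products of factor eigenvalues,
# so only the small matrices are ever decomposed.
e1 = np.linalg.eigvalsh(K1)
e2 = np.linalg.eigvalsh(K2)
log_det_kron = np.sum(np.log(np.outer(e1, e2) + noise_var))

# Reference: form the full n x n matrix explicitly (feasible only for small n).
K_full = np.kron(K1, K2)
log_det_full = np.linalg.slogdet(K_full + noise_var * np.eye(K_full.shape[0]))[1]
```

This shortcut relies on the kernel matrix being a single Kronecker product; it does not carry over directly to sums of Kronecker products.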
Inference
• For additive kernels
• $K^{-1}$ can be computed efficiently using LCG* (linear conjugate gradients)
• But how can we compute $\log|K|$?
*See Flaxman et al. (2015)
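For an additive kernel, the solve against $K$ can be carried out with conjugate gradients that only ever multiply by the small Kronecker factors. This is a textbook sketch, assuming LCG refers to linear conjugate gradients and that the additive kernel is a sum of two Kronecker products plus noise; it is not the authors' implementation, and all kernels and data are hypothetical.

```python
import numpy as np

def se_kernel(x, lengthscale):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def kron_mvm(A, B, v):
    """Compute kron(A, B) @ v without forming the Kronecker product."""
    V = v.reshape(A.shape[0], B.shape[0])
    return (A @ V @ B.T).ravel()

def conjugate_gradients(matvec, y, tol=1e-10, max_iter=1000):
    """Solve matvec(x) = y for a symmetric positive definite operator."""
    x = np.zeros_like(y)
    r = y - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:          # stop once the residual norm is tiny
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Hypothetical additive kernel on an 8 x 6 grid: two Kronecker terms plus noise.
g1, g2 = np.linspace(0.0, 1.0, 8), np.linspace(0.0, 1.0, 6)
K1a, K2a = se_kernel(g1, 0.3), se_kernel(g2, 0.5)
K1b, K2b = se_kernel(g1, 1.0), se_kernel(g2, 0.1)
noise_var = 0.1

def additive_mvm(v):
    return kron_mvm(K1a, K2a, v) + kron_mvm(K1b, K2b, v) + noise_var * v

y = np.sin(np.arange(48.0))
x_cg = conjugate_gradients(additive_mvm, y)   # approximates (K + noise_var I)^{-1} y
```

The solve never needs the determinant, which is why the log determinant is treated as a separate problem.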
Inference
• Choosing indices i, j

  Method                                                        Complexity
  Minimization for best pair                                    $O(n^2)$
  “Middle” heuristic (i = j or i = j + 1)                       $O(n)$
  Greedy search of s pairs below and above the previous pair    $O(2sn)$

Inference
• Scaling functions, σ(w(x))
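The choice of indices i, j can be illustrated with Weyl's inequality, which states that for symmetric matrices with eigenvalues sorted in decreasing order, $\lambda_{i+j-1}(A + B) \le \lambda_i(A) + \lambda_j(B)$. The sketch below uses it to bound the log determinant of a sum of two kernel matrices plus noise, comparing exact minimization over pairs with the “middle” heuristic. It is an illustration under assumptions (two random positive semi-definite matrices, a hypothetical noise variance), not the authors' implementation, and it omits the greedy variant.

```python
import numpy as np

def weyl_logdet_bound(eig_a, eig_b, noise_var, method="middle"):
    """Upper bound on log|A + B + noise_var I| from the eigenvalues of A and B."""
    a = np.sort(eig_a)[::-1]               # decreasing order
    b = np.sort(eig_b)[::-1]
    n = len(a)
    bounds = np.empty(n)
    for k in range(1, n + 1):              # bound the k-th largest eigenvalue
        if method == "middle":
            i = k // 2 + 1                 # gives i = j (odd k) or i = j + 1 (even k)
            j = k + 1 - i
            bounds[k - 1] = a[i - 1] + b[j - 1]
        else:
            # Exact minimization over all pairs with i + j - 1 = k; O(n^2) overall.
            i = np.arange(1, k + 1)
            bounds[k - 1] = np.min(a[i - 1] + b[k - i])
    return np.sum(np.log(bounds + noise_var))

rng = np.random.default_rng(0)
M1, M2 = rng.standard_normal((2, 30, 30))
A, B = M1 @ M1.T, M2 @ M2.T                # hypothetical PSD matrices
noise_var = 0.1

eig_a, eig_b = np.linalg.eigvalsh(A), np.linalg.eigvalsh(B)
bound_exact = weyl_logdet_bound(eig_a, eig_b, noise_var, method="exact")
bound_middle = weyl_logdet_bound(eig_a, eig_b, noise_var, method="middle")
true_log_det = np.linalg.slogdet(A + B + noise_var * np.eye(30))[1]
```

The middle heuristic evaluates one pair per eigenvalue, so it trades some tightness for linear cost; the exact minimization is never looser than it.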
Inference
[Figure: two panels titled “3 Kernels”, one showing time (sec) and the other the log determinant approximation ratio, both against number of observations. Legend: Weyl exact, Weyl middle, Weyl greedy, Cheb−Hutch, True log det]

Inference: so what?!
• Linear complexity for additive kernels
  – $O(D n^{(D+1)/D})$
• Scalable inference for non-separable kernels in space and time
• Scalable inference for non-stationary kernels
Numerical Experiments
• 2500 points of synthetic data
• 2 functional regimes defined by squared exponential kernels
• Change surface defined by

Results - Numerical
Demographic Analysis