Spatial Autoregressive Models Sudipto Banerjee Division of - PowerPoint PPT Presentation

Spatial Autoregressive Models Sudipto Banerjee Division of Biostatistics, University of Minnesota September 22, 2009 1 Areal Modelling

Areal unit data Maps of raw standard mortality ratios (SMR) of lung and esophagus cancer between 1991 and 1998 in Minnesota counties (a) (c) Lung Cancer Esophagus Cancer 0 - 0.5 0 - 0.5 0.5 - 0.9 0.5 - 0.9 0.9 - 1.1 0.9 - 1.1 1.1 - 1.5 1.1 - 1.5 1.5 - 3.3 1.5 - 3.3 2 Areal Modelling

Areal unit data 3 Areal Modelling

Areal unit data 4 Areal Modelling

Areal unit data Key Issues Is there spatial pattern? Spatial pattern implies that observations from units closer to each other are more similar than those recorded in units farther away. Do we want to smooth the data? Perhaps to adjust for low population sizes (or sample sizes) in certain units? How much do we want to smooth? Inference for new areal units? Is prediction meaningful here? If we modify the areal units to new units (e.g. from zip codes to county values), what can we say about the new counts we expect for the latter given those for the former? This is the Modifiable Areal Unit Problem (MAUP) or Misalignment. 5 Areal Modelling

Proximity matrices W , entries w ij , ( w ii = 0 ); choices for w ij : w ij = 1 if i, j share a common boundary (possibly a common vertex) w ij is an inverse distance between units w ij = 1 if distance between units is ≤ K w ij = 1 for m nearest neighbors. W need not be symmetric. W : standardize row i by w i + = � � j w ij (row stochastic but need not be symmetric). W elements often called “weights”; nicer interpretation? 6 Areal Modelling

Proximity matrices Note that proximity matrices are user-defined. We can define distance intervals, (0 , d 1 ] , ( d 1 , d 2 ] , and so on. First order neighbours: all units within distance d 1 . First order proximity matrix W (1) . Analogous to W , w (1) ij = 1 if i and j are first order neighbors; 0 otherwise. Second order neighbors: all units within distance d 2 , but separated by more than d 1 . Second order proximity matrix W (2) ; w (2) ij = 1 if i and j are second order neighbors; 0 otherwise And so on... 7 Areal Modelling

Proximity matrices There are analogues for areal data of the empirical correlation function and the variogram. Moran’s I : analogue of lagged autocorrelation n � � j w ij ( Y i − ¯ Y )( Y j − ¯ Y ) i I = ( � i � = j w ij )( � i ( Y i − ¯ Y ) 2 I is not supported on [ − 1 , 1] . Geary’s C : analogue of Durbin-Watson statistic ( n − 1) � � j w ij ( Y i − Y j ) 2 i C = � i � = j w ij ) � i ( Y i − ¯ Y ) 2 Both are asymptotically normal if Y i are i.i.d., the first with mean − 1 / ( n − 1) and the second with mean 1 . Significance testing using a Monte Carlo test, permutation invariance 8 Areal Modelling

Proximity matrices The areal correlogram is a useful tool to study spatial association with areal data. Working with I , we can replace w ij with w (1) ij taken from W (1) and compute → I (1) ij taken from W (2) and compute Next replace w ij with w (2) → I (2) , etc. Plot I ( r ) vs. r If there is spatial pattern, we expect I ( r ) to decline in r initially and then vary about 0. 9 Areal Modelling

Proximity matrices 10 Areal Modelling

Spatial smoothers P i w ij Y j To smooth Y i , replace with ˆ Y i = Note: K -nearest w i + neighbours (KNN) regression falls within this framework. More generally, (1 − α ) Y i + α ˆ Y i Linear (convex) combination, shrinkage Model-based smoothing, e.g., E ( Y i |{ Y j , j = 1 , 2 , ..., n } ) 11 Areal Modelling

Markov Random Fields First, consider Y = ( y 1 , y 2 , ..., y n ) and consider the set { p ( y i | y j , j � = i ) } We know p ( y 1 , y 2 , ...y n ) determines { p ( y i | y j , j � = i ) } (full conditional distributions) ??? Does { p ( y i | y j , j � = i ) } determine p ( y 1 , y 2 , ...y n ) ? If so, we call the joint distribution a Markov Random Field . In general we cannot write down an arbitrary set of conditionals and assert that they determine the joint distribution. Example: Y 1 | Y 2 ∼ N ( α 0 + α 1 Y 2 , σ 2 1 ) Y 2 | Y 1 ∼ N ( β 0 + β 1 Y 3 1 , σ 2 2 ) . The first equation implies that E [ Y 1 ] = α 0 + α 1 E [ Y 2 ] , i.e., E [ Y 1 ] is linear in E [ Y 2 ] . The second equation implies that E [ Y 2 ] = β 0 + β 1 E [ Y 3 1 ] , i.e. E [ Y 2 ] is linear in E [ Y 3 1 ] . Clearly this isn’t true in general. Hence no joint distribution. 12 Areal Modelling

Markov Random Fields Also p ( y 1 , . . . , y n ) may be improper even if all the full conditionals are proper. p ( y 1 , y 2 ) ∝ exp { ( y 1 − y 2 ) 2 } But p ( Y 2 | Y 1 ) ∝ N ( Y 2 ) and p ( Y 1 | Y 2 ) ∝ N ( Y 2 , 1) . Yet the joint distribution is improper. Compatibility: Brook’s Lemma. Let y 0 = ( y 10 , . . . , y n 0 ) be any fixed point in the support of p ( · ) . p ( y 1 , . . . , y n ) = p ( y 1 | y 2 , . . . , y n ) p ( y 2 | y 10 , y 3 , . . . , y n ) p ( y 10 | y 2 , . . . , y n ) p ( y 20 | y 10 , y 3 , . . . , y n ) . . . p ( y n | y 10 , . . . , y n − 1 , 0 ) p ( y n 0 | y 10 , . . . , y n − 1 , 0 ) p ( y 10 , . . . , y n 0 ) . If LHS is proper, the fact that it integrates to 1 determines the normalizing constant! 13 Areal Modelling

Local specifications Suppose we want: p ( y i | y j , j � = i ) = p ( y i | y j ∈ ∂ i ) When does the set { p ( y i | y j ∈ ∂ i ) } uniquely determine p ( y 1 , y 2 , ...y n ) ? To answer this question, we need the following important concepts: Clique: A clique is a set of cells such that each element is a neighbor of every other element. We use notation i ∼ j if i is a neighbor of j and j is a neighbor of i . Potential: A potential of order k is a function of k arguments that is exchangeable in these arguments. The arguments of the potential would be the values taken by variables associated with the cells for a clique of size k . 14 Areal Modelling

Local specifications For clique size say 2, i ∼ j means j ∼ i ( ⇔ ( y i − y j ) 2 ) For continuous data: Q ( y i , y j ) = y i y j For binary data: Q ( y i , y j ) = I ( y i = y j ) = y i y j = (1 − y i )(1 − y j ) Cliques of size 1 ⇔ independence Cliques of size 2 ⇔ pairwise difference form     �  − 1 ( y i − y j ) 2 I ( i ∼ j ) p ( y 1 , y 2 , ...y n ) ∝ exp 2 τ 2  i,j and therefore p ( y i | y j , j � = i ) = N ( � j ∈ ∂ i y i /m i , τ 2 /m i ) , where m i is the number of neighbors of i 15 Areal Modelling

Local specifications Gibbs distribution: p ( y 1 , . . . , y n ) is a Gibbs distribution if it is a function of the y i ’s only through potentials on cliques: � � φ ( k ) ( y α 1 , . . . , y α k ) } , p ( y 1 , . . . , y n ) ∝ exp {− γ k α ∈ M k where φ ( k ) is a potential of order k , M k is the set of all cliques of size k and is indexed by α , and γ > 0 is a scale parameter. Hammersley-Clifford Theorem: If we have a Markov Random Field (i.e., { p ( y i | y j ∈ ∂ i ) } uniquely determine p ( y 1 , y 2 , ...y n ) ), then the latter is a Gibbs distribution Geman and Geman (1984) result : If we have a joint Gibbs distribution, then we have a Markov Random Field 16 Areal Modelling

CAR models Conditionally Auto-Regressive (CAR) models Gaussian (autonormal) case   � b ij y j , τ 2  p ( y i | y j , j � = i ) = N i j Using Brook’s Lemma we can obtain � � − 1 2 y ′ D − 1 ( I − B ) y p ( y 1 , y 2 , ...y n ) ∝ exp where B = { b ij } and D is diagonal with D ii = τ 2 i . Suggests a multivariate normal distribution with µ y = 0 and Σ Y = ( I − B ) − 1 D D − 1 ( I − B ) symmetric requires b ij = b ji for all i, j τ 2 τ 2 i j 17 Areal Modelling

CAR models Returning to W , let b ij = w ij /w i + and τ 2 i = τ 2 /w i + , so p ( y 1 , y 2 , ...y n ) ∝ exp {− 1 2 τ 2 y ′ ( D w − W ) y } where D w is diagonal with ( D w ) ii = w i + and thus     �  − 1 w ij ( y i − y j ) 2 p ( y 1 , y 2 , ...y n ) ∝ exp 2 τ 2  i � = j Caution: ( D w − W ) 1 = 0 . Intrinsic autoregressive (IAR) model; improper, so requires a constraint (e.g., � i y i = 0 ) Not a valid data model, but only as a random effects model! 18 Areal Modelling

CAR models With τ 2 unknown, what should be the power of τ 2 ? Answer: p ( y 1 , y 2 , ...y n ) ∝ ( 1 τ 2 ) ( n − G ) / 2 exp {− 1 2 τ 2 y ′ ( D w − W ) y } , where G is the number of “islands” in the map. In fact, n − G is the rank of D w − W . The impropriety can be remedied in an obvious way. Redefine the CAR as: p ( y 1 , y 2 , ...y n ) ∝ | D w − ρW | 1 / 2 exp {− 1 2 τ 2 y ′ ( D w − ρW ) y } , where ρ is chosen to make D w − ρW non-singular. This is � � guaranteed if ρ ∈ 1 /λ (1) , 1 , where λ (1) is the minimum eigenvalue of D − 1 / 2 WD − 1 / 2 . In practice, the bound ρ ∈ (0 , 1) is often preferred. 19 Areal Modelling

CAR models To ρ or not to ρ ? Advantages: makes distribution proper adds parametric flexibility ρ = 0 interpretable as independence Disadvantages: why should we expect y i to be a proportion of average of neighbors - sensible spatial interpretation? calibration of ρ as a correlation, e.g., ρ = 0 . 80 yields 0 . 1 ≤ I ≤ 0 . 15 , ρ = 0 . 90 yields 0 . 2 ≤ I ≤ 0 . 25 , ρ = 0 . 99 yields I ≤ 0 . 5 So, used with random effects, scope of spatial pattern may be limited 20 Areal Modelling

Spatial Autoregressive Models Sudipto Banerjee Division of - PowerPoint PPT Presentation

Spatial Autoregressive Models Sudipto Banerjee Division of Biostatistics, University of Minnesota September 22, 2009 1 Areal Modelling Areal unit data Maps of raw standard mortality ratios (SMR) of lung and esophagus cancer between 1991 and

Partial MLE, Marginal Effects and Asymptotics for Spatial Autoregressive Nonlinear Probit Models

Autoregressive Models Autoregressive Models In [1]: from mxnet import autograd, nd, gluon, init

Adaptive Estimation of Autoregressive Models with Time-Varying Variances Ke-Li Xu and Peter C.

Financial Econometrics Econ 40357 ARIMA Part 2: Autoregressive Models N.C. Mark University of

Adversarial Attacks on Probabilistic Autoregressive Forecasting Models 2 (i) Probabilistic

Causal analysis within the framework of structural autoregressive models Alessio Moneta Scuola

CSC321 Lecture 20: Reversible and Autoregressive Models Roger Grosse Roger Grosse CSC321

Time Domain Models Box & Jenkins popularized an approach to time series analysis based on

ardl: Estimating autoregressive distributed lag and equilibrium correction models Sebastian

Efficient estimators in nonlinear and heteroscedastic autoregressive models with constraints

On a Class of Nonparametric Bayesian Autoregressive Models Maria Anna Di Lucca 1 , Alessandra

Estimation of High-dimensional Vector Autoregressive (VAR) models George Michailidis Department

Deep Autoregressive Models mainly PixelCNN and Wavenet 1 Another Way to Generate

Hint-Based Training for Non-Autoregressive Translation Zhuohan Li Zi Lin Fei Tian Tao Qin

Autoregressive Models Overview Direct Structures P Direct structures x ( n ) = a k x

Variance bounds for estimators in autoregressive models with constraints Wolfgang Wefelmeyer

EC3062 ECONOMETRICS DYNAMIC REGRESSIONS MODELS Autoregressive Disturbance Processes Economic

ardl: Stata module to estimate autoregressive distributed lag models Sebastian Kripfganz 1 Daniel

Chapter 4: Video 1 - Supplemental slides The Autoregressive Model Autoregressive (AR) processes

Learning unknown forces in nonlinear models with Gaussian processes and autoregressive flows Wil

CoT: Cooperative Training for Generative Modeling of Discrete Data

Bayesian Graphical Models for Structural Vector Autoregressive Processes Daniel Ahelegbey, Monica

TESTING THE TAIL INDEX IN AUTOREGRESSIVE MODELS Jana Jure ckov a, Charles University in

Markov-switching autoregressive latent variable models for longitudinal data University of

Spatial Autoregressive Models Sudipto Banerjee Division of - PowerPoint PPT Presentation

Spatial Autoregressive Models Sudipto Banerjee Division of Biostatistics, University of Minnesota September 22, 2009 1 Areal Modelling Areal unit data Maps of raw standard mortality ratios (SMR) of lung and esophagus cancer between 1991 and

Partial MLE, Marginal Effects and Asymptotics for Spatial Autoregressive Nonlinear Probit Models

Autoregressive Models Autoregressive Models In [1]: from mxnet import autograd, nd, gluon, init

Adaptive Estimation of Autoregressive Models with Time-Varying Variances Ke-Li Xu and Peter C.

Financial Econometrics Econ 40357 ARIMA Part 2: Autoregressive Models N.C. Mark University of

Adversarial Attacks on Probabilistic Autoregressive Forecasting Models 2 (i) Probabilistic

Causal analysis within the framework of structural autoregressive models Alessio Moneta Scuola

CSC321 Lecture 20: Reversible and Autoregressive Models Roger Grosse Roger Grosse CSC321

Time Domain Models Box &amp; Jenkins popularized an approach to time series analysis based on

ardl: Estimating autoregressive distributed lag and equilibrium correction models Sebastian

Efficient estimators in nonlinear and heteroscedastic autoregressive models with constraints

On a Class of Nonparametric Bayesian Autoregressive Models Maria Anna Di Lucca 1 , Alessandra

Estimation of High-dimensional Vector Autoregressive (VAR) models George Michailidis Department

Deep Autoregressive Models mainly PixelCNN and Wavenet 1 Another Way to Generate

Hint-Based Training for Non-Autoregressive Translation Zhuohan Li Zi Lin Fei Tian Tao Qin

Autoregressive Models Overview Direct Structures P Direct structures x ( n ) = a k x

Variance bounds for estimators in autoregressive models with constraints Wolfgang Wefelmeyer

EC3062 ECONOMETRICS DYNAMIC REGRESSIONS MODELS Autoregressive Disturbance Processes Economic

ardl: Stata module to estimate autoregressive distributed lag models Sebastian Kripfganz 1 Daniel

Chapter 4: Video 1 - Supplemental slides The Autoregressive Model Autoregressive (AR) processes

Learning unknown forces in nonlinear models with Gaussian processes and autoregressive flows Wil

CoT: Cooperative Training for Generative Modeling of Discrete Data

Bayesian Graphical Models for Structural Vector Autoregressive Processes Daniel Ahelegbey, Monica

TESTING THE TAIL INDEX IN AUTOREGRESSIVE MODELS Jana Jure ckov a, Charles University in

Markov-switching autoregressive latent variable models for longitudinal data University of

Time Domain Models Box & Jenkins popularized an approach to time series analysis based on