Lecture 18
Models for areal data
Colin Rundel
03/22/2017
Areal / lattice data
Example - NC SIDS

[Figure: NC SIDS data, variable SID79.]
EDA - Moran’s I

If we have observations at n spatial locations $(s_1, \ldots, s_n)$,

$$ I = \frac{n}{\sum_{i=1}^n \sum_{j=1}^n w_{ij}} \, \frac{\sum_{i=1}^n \sum_{j=1}^n w_{ij} \big(y(s_i) - \bar{y}\big)\big(y(s_j) - \bar{y}\big)}{\sum_{i=1}^n \big(y(s_i) - \bar{y}\big)^2} $$

where w is a spatial weights matrix.

Some properties of Moran’s I (when there is no spatial autocorrelation):
• $E(I) = -1/(n-1)$
• $Var(I) = E(I^2) - E(I)^2$ = something ugly but closed form
• Asymptotically, $\dfrac{I - E(I)}{\sqrt{Var(I)}} \sim N(0, 1)$
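The expectation under no spatial autocorrelation can be checked by simulation. The sketch below is only an illustration under stated assumptions: `moran_stat` is a hypothetical helper that restates the formula above, and the layout (10 locations on a line, adjacent locations as neighbors) is made up for the example. Shuffling the values across locations removes any spatial structure, so the average of the shuffled statistics should sit close to $-1/(n-1)$.

```r
# Hypothetical helper restating the Moran's I formula above
moran_stat = function(y, w) {
  n = length(y)
  dev = y - mean(y)
  (n / sum(w)) * sum(w * outer(dev, dev)) / sum(dev^2)
}

# Assumed toy layout: 10 locations on a line, neighbors are adjacent locations
n = 10
w = 1 * (abs(outer(1:n, 1:n, "-")) == 1)

set.seed(1)
y = rnorm(n)

# Randomly reassigning values to locations gives draws of I under "no autocorrelation"
I_null = replicate(5000, moran_stat(sample(y), w))

# Compare the simulated mean with the theoretical value -1/(n-1)
c(simulated = mean(I_null), theoretical = -1 / (n - 1))
```

The same vector of simulated values could also be used to standardize an observed I as an alternative to the closed-form variance.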
NC SIDS & Moran’s I

Let's start by using an adjacency matrix for w (shared county borders).

    morans_I = function(y, w) {
      n = length(y)
      y_bar = mean(y)
      # weighted sum of cross-products of deviations from the mean
      num = sum(w * (y-y_bar) %*% t(y-y_bar))
      denom = sum( (y-y_bar)^2 )
      (n/sum(w)) * (num/denom)
    }

    morans_I(y = nc$SID74, w = 1*st_touches(nc, sparse=FALSE))
    ## [1] 0.119089

    library(ape)
    Moran.I(nc$SID74, weight = 1*st_touches(nc, sparse=FALSE)) %>% str()
    ## List of 4
    ##  $ observed: num 0.148
    ##  $ expected: num -0.0101
    ##  $ sd      : num 0.0627
    ##  $ p.value : num 0.0118
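The hand-rolled value (0.119) and the `ape` value (0.148) differ. One plausible explanation, which should be checked against the `ape` source rather than taken as given, is that `Moran.I` rescales each row of the weight matrix to sum to one before computing the statistic. The sketch below uses made-up data on 6 locations along a line and a hypothetical helper `moran_stat` (the same formula as `morans_I`) to show that binary and row-standardized weights generally give different values of I.

```r
# Hypothetical helper, same formula as morans_I in the lecture
moran_stat = function(y, w) {
  n = length(y)
  dev = y - mean(y)
  (n / sum(w)) * sum(w * outer(dev, dev)) / sum(dev^2)
}

# Assumed toy data: 6 locations on a line, adjacent locations are neighbors
y = c(1, 2, 4, 3, 7, 8)
n = length(y)
w_binary = 1 * (abs(outer(1:n, 1:n, "-")) == 1)

# Row standardization: divide row i by its number of neighbors
w_rowstd = w_binary / rowSums(w_binary)

I_binary = moran_stat(y, w_binary)
I_rowstd = moran_stat(y, w_rowstd)

c(binary = I_binary, row_standardized = I_rowstd)
```

Which weighting to use is a modeling choice; the point here is only that the two are not interchangeable when locations have different numbers of neighbors.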
EDA - Geary’s C

Like Moran’s I, if we have observations at n spatial locations $(s_1, \ldots, s_n)$,

$$ C = \frac{n-1}{2 \sum_{i=1}^n \sum_{j=1}^n w_{ij}} \, \frac{\sum_{i=1}^n \sum_{j=1}^n w_{ij} \big(y(s_i) - y(s_j)\big)^2}{\sum_{i=1}^n \big(y(s_i) - \bar{y}\big)^2} $$

where w is a spatial weights matrix.

Some properties of Geary’s C:
• $0 < C < 2$
• If $C \approx 1$ then no spatial autocorrelation
• If $C > 1$ then negative spatial autocorrelation
• If $C < 1$ then positive spatial autocorrelation
• Geary’s C is inversely related to Moran’s I
NC SIDS & Geary’s C

Again using an adjacency matrix for w (shared county borders).

    gearys_C = function(y, w) {
      n = length(y)
      y_bar = mean(y)
      # n x n matrices holding y_i along rows and y_j along columns
      y_i = y %*% t(rep(1,n))
      y_j = t(y_i)
      num = sum(w * (y_i-y_j)^2)
      denom = sum( (y-y_bar)^2 )
      ((n-1)/(2*sum(w))) * (num/denom)
    }

    gearys_C(y = nc$SID74, w = 1*st_touches(nc, sparse=FALSE))
    ## [1] 0.8898868
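To see the inverse relationship between the two statistics, it can help to compute both on data with an obvious pattern. The following is a minimal sketch on assumed toy data (6 locations on a line); `moran_stat` and `geary_stat` are hypothetical helpers that restate the two formulas. A smooth trend should give I above 0 and C below 1, while an alternating pattern should give I below 0 and C above 1.

```r
# Hypothetical helpers restating the Moran's I and Geary's C formulas
moran_stat = function(y, w) {
  n = length(y)
  dev = y - mean(y)
  (n / sum(w)) * sum(w * outer(dev, dev)) / sum(dev^2)
}

geary_stat = function(y, w) {
  n = length(y)
  sq_diff = outer(y, y, "-")^2   # (y_i - y_j)^2 for every pair
  ((n - 1) / (2 * sum(w))) * sum(w * sq_diff) / sum((y - mean(y))^2)
}

# Assumed toy layout: 6 locations on a line, adjacent locations are neighbors
n = 6
w = 1 * (abs(outer(1:n, 1:n, "-")) == 1)

y_smooth = 1:n                            # neighbors have similar values
y_alt    = rep(c(1, -1), length.out = n)  # neighbors have opposite values

rbind(
  smooth      = c(morans = moran_stat(y_smooth, w), gearys = geary_stat(y_smooth, w)),
  alternating = c(morans = moran_stat(y_alt, w),    gearys = geary_stat(y_alt, w))
)
```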
Spatial Correlogram

    # strip_class() is not defined in this section; it is presumably a helper
    # that drops the class attribute from the distance matrix
    d = nc %>% st_centroid() %>% st_distance() %>% strip_class()
    breaks = seq(0, max(d), length.out = 21)
    d_cut = cut(d, breaks)

    # one 0/1 "adjacency" matrix per distance bin
    adj_mats = map(
      levels(d_cut),
      function(l) {
        (d_cut == l) %>%
          matrix(ncol=100) %>%
          `diag<-`(0)
      }
    )

    d = data_frame(
      dist   = breaks[-1],
      morans = map_dbl(adj_mats, morans_I, y = nc$SID74),
      gearys = map_dbl(adj_mats, gearys_C, y = nc$SID74)
    )
[Figure: correlogram with one panel each for morans and gearys; x-axis dist, y-axis value.]
[Figure: scatter plot of gearys (y-axis) against morans (x-axis).]
Autoregressive Models
AR Models - Time

Let's just focus on the simplest case, an AR(1) process

$$ y_t = \delta + \phi \, y_{t-1} + w_t $$

where $w_t \sim N(0, \sigma^2)$ and $|\phi| < 1$, then

$$ E(y_t) = \frac{\delta}{1 - \phi} $$

$$ Var(y_t) = \frac{\sigma^2}{1 - \phi^2} $$
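Both moments follow from stationarity: taking expectations of the AR(1) equation gives $E(y_t) = \delta + \phi E(y_t)$, and taking variances gives $Var(y_t) = \phi^2 Var(y_t) + \sigma^2$, so the stationary variance is $\sigma^2 / (1 - \phi^2)$. The sketch below checks this by simulating one long series; the parameter values are arbitrary choices for illustration, not values from the lecture.

```r
set.seed(42)

# Arbitrary illustrative parameters (assumptions, not from the lecture)
delta = 1
phi   = 0.5
sigma = 1
n     = 1e5

w = rnorm(n, mean = 0, sd = sigma)

# Simulate the AR(1) recursion, starting at the stationary mean
y = numeric(n)
y[1] = delta / (1 - phi)
for (t in 2:n) {
  y[t] = delta + phi * y[t - 1] + w[t]
}

# Compare sample moments with the stationary values
c(mean_sim = mean(y), mean_theory = delta / (1 - phi))
c(var_sim  = var(y),  var_theory  = sigma^2 / (1 - phi^2))
```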
AR Models - Time - Joint Distribution

Previously we saw that an AR(1) model can be represented using a multivariate normal distribution

$$ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \sim N\left( \frac{\delta}{1-\phi} \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \; \frac{\sigma^2}{1-\phi^2} \begin{pmatrix} 1 & \phi & \cdots & \phi^{n-1} \\ \phi & 1 & \cdots & \phi^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \phi^{n-1} & \phi^{n-2} & \cdots & 1 \end{pmatrix} \right) $$

In writing down the likelihood we also saw that an AR(1) is 1st order Markovian,

$$ \begin{aligned} f(y_1, \ldots, y_n) &= f(y_1) \, f(y_2 \mid y_1) \, f(y_3 \mid y_2, y_1) \cdots f(y_n \mid y_{n-1}, y_{n-2}, \ldots, y_1) \\ &= f(y_1) \, f(y_2 \mid y_1) \, f(y_3 \mid y_2) \cdots f(y_n \mid y_{n-1}) \end{aligned} $$
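The Markov property shows up directly in the inverse of the covariance matrix: for a multivariate normal, a zero in the precision matrix means the two components are conditionally independent given the rest. As a small sketch with arbitrary parameter values (assumptions for illustration only), the code below builds the AR(1) covariance with entries $\frac{\sigma^2}{1-\phi^2}\phi^{|i-j|}$ and checks that its inverse is tridiagonal.

```r
# Arbitrary illustrative parameters (assumptions, not from the lecture)
n      = 6
phi    = 0.6
sigma2 = 2

# AR(1) covariance: sigma^2 / (1 - phi^2) * phi^|i - j|
Sigma = sigma2 / (1 - phi^2) * phi^abs(outer(1:n, 1:n, "-"))

# Precision matrix
Q = solve(Sigma)

# Entries more than one step off the diagonal should be numerically zero,
# i.e. y_i and y_j are conditionally independent unless |i - j| <= 1
off_band = abs(row(Q) - col(Q)) > 1
max(abs(Q[off_band]))

round(Q, 3)
```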
Competing Definitions for $y_t$

$$ y_t = \delta + \phi \, y_{t-1} + w_t $$

vs.

$$ y_t \mid y_{t-1} \sim N(\delta + \phi \, y_{t-1}, \; \sigma^2) $$

In the case of time, both of these definitions result in the same multivariate distribution for y.
AR in Space

[Figure: ten locations s1, s2, ..., s10 arranged in a row.]

Even in the simplest spatial case there is no clear / unique ordering,

$$ \begin{aligned} f\big(y(s_1), \ldots, y(s_{10})\big) &= f\big(y(s_1)\big) \, f\big(y(s_2) \mid y(s_1)\big) \cdots f\big(y(s_{10}) \mid y(s_9), y(s_8), \ldots, y(s_1)\big) \\ &= f\big(y(s_{10})\big) \, f\big(y(s_9) \mid y(s_{10})\big) \cdots f\big(y(s_1) \mid y(s_2), y(s_3), \ldots, y(s_{10})\big) \\ &= \; ? \end{aligned} $$

Instead we need to think about things in terms of their neighbors / neighborhoods. We will define $N(s_i)$ to be the set of neighbors of location $s_i$.

• If we define the neighborhood based on “touching” then $N(s_3) = \{s_2, s_4\}$
• If we use distance within 2 units (assuming adjacent locations are one unit apart and a location is not its own neighbor) then $N(s_3) = \{s_1, s_2, s_4, s_5\}$
• etc.
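The two neighborhood definitions can be written down as logical adjacency matrices. This is only a sketch under an assumption not stated in the text: the ten locations are taken to sit at coordinates 1, 2, ..., 10 on a line, so "touching" means a distance of exactly 1, and a location is never counted as its own neighbor.

```r
# Assumed coordinates: ten locations one unit apart on a line
coords = 1:10
d = as.matrix(dist(coords))

# Two neighborhood definitions as logical adjacency matrices
touching = d == 1            # immediately adjacent locations
within_2 = d > 0 & d <= 2    # within 2 units, excluding the location itself

# Neighbors of s_3 under each definition (as location indices)
N_touch = unname(which(touching[3, ]))
N_dist2 = unname(which(within_2[3, ]))

N_touch
N_dist2
```

Under these assumptions the first set contains locations 2 and 4, and the second contains 1, 2, 4 and 5.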
Defining the Spatial AR model

Here we will consider a simple average of neighboring observations. Just like with the temporal AR model, we have two options in terms of defining the autoregressive process:

• Simultaneous Autoregressive (SAR)

$$ y(s) = \delta + \phi \, \frac{1}{|N(s)|} \sum_{s' \in N(s)} y(s') + N(0, \sigma^2) $$

• Conditional Autoregressive (CAR)

$$ y(s) \mid y_{-s} \sim N\left( \delta + \phi \, \frac{1}{|N(s)|} \sum_{s' \in N(s)} y(s'), \; \sigma^2 \right) $$
Simultaneous Autoregressive (SAR)

Using

$$ y(s) = \delta + \phi \, \frac{1}{|N(s)|} \sum_{s' \in N(s)} y(s') + N(0, \sigma^2) $$

we want to find the distribution of $\mathbf{y} = \big( y(s_1), y(s_2), \ldots, y(s_n) \big)^t$.

First we need to define a weight matrix $\mathbf{W}$ where

$$ \{\mathbf{W}\}_{ij} = \begin{cases} 1 / |N(s_i)| & \text{if } s_j \in N(s_i) \\ 0 & \text{otherwise} \end{cases} $$

then we can write $\mathbf{y}$ as follows,

$$ \mathbf{y} = \boldsymbol{\delta} + \phi \, \mathbf{W} \mathbf{y} + \boldsymbol{\epsilon} $$

where $\boldsymbol{\delta}$ is the vector with every entry equal to $\delta$ and

$$ \boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}) $$
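Rearranging the matrix form gives $(\mathbf{I} - \phi \mathbf{W}) \mathbf{y} = \boldsymbol{\delta} + \boldsymbol{\epsilon}$, so $\mathbf{y} = (\mathbf{I} - \phi \mathbf{W})^{-1} (\boldsymbol{\delta} + \boldsymbol{\epsilon})$, which is multivariate normal with mean $(\mathbf{I} - \phi \mathbf{W})^{-1} \boldsymbol{\delta}$ and covariance $\sigma^2 (\mathbf{I} - \phi \mathbf{W})^{-1} \big((\mathbf{I} - \phi \mathbf{W})^{-1}\big)^t$. The sketch below works through this for an assumed layout (10 locations on a line, adjacent locations as neighbors) with arbitrary parameter values; none of these choices come from the lecture.

```r
# Assumed toy layout: 10 locations on a line, neighbors are adjacent locations
n   = 10
adj = 1 * (abs(outer(1:n, 1:n, "-")) == 1)

# Row-normalized weight matrix: W_ij = 1/|N(s_i)| if s_j is a neighbor of s_i
W = adj / rowSums(adj)

# Arbitrary illustrative parameters (assumptions)
delta = 1
phi   = 0.8
sigma = 0.5

set.seed(7)
eps = rnorm(n, mean = 0, sd = sigma)

# Solve (I - phi W) y = delta + eps for y
A = diag(n) - phi * W
y = solve(A, rep(delta, n) + eps)

# Implied mean and covariance of y
A_inv = solve(A)
mu    = as.vector(A_inv %*% rep(delta, n))
Sigma = sigma^2 * A_inv %*% t(A_inv)

mu
```

Because each row of $\mathbf{W}$ sums to one, the implied mean is the same at every location, $\delta / (1 - \phi)$, matching the temporal AR(1) result.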