BAYESIAN MODEL SELECTION IN SPATIAL LATTICE MODELS Victor De Oliveira Department of Management Science and Statistics The University of Texas at San Antonio San Antonio, TX USA victor.deoliveira@utsa.edu http://faculty.business.utsa.edu/vdeolive Joint work with J.J. Song The Fourth Erich L. Lehmann Symposium, May 9–12, 2011
Example 1: Phosphate Data
Raw phosphate concentrations (in mg P/100 g of soil) collected over several years on a 16 by 16 regular lattice in an archaeological region of Greece.
[Figure: the observed concentrations printed at their lattice sites; horizontal axis $s_1$ (meters), vertical axis $s_2$ (meters).]
Example 2: Crime Data
Homicide rates per 100,000 inhabitants in 1980 for counties in the southern US, with $n = 1412$ counties.
[Figure: county map by longitude (ticks from −105 to −75) and latitude (ticks from 25 to 40), shaded in five rate classes: 0.00−4.80, 4.80−7.85, 7.85−10.92, 10.92−15.03, 15.03−42.34.]
Models for Spatial Lattice Data
• Conditional Autoregressive (CAR) models: mostly studied and applied in the statistical literature
• Simultaneous Autoregressive (SAR) models: mostly studied and applied in the econometric and geography literature
Both classes require specifying a neighborhood system
Neighborhood Systems
Sites $\{1, \ldots, n\}$ are endowed with a neighborhood system $\{N_i : i = 1, \ldots, n\}$, where $N_i$ is the set of neighbors of site $i$.
Examples:
• $N_i = \{ j : \text{site } j \text{ shares a boundary with site } i \}$
• $N_i = \{ j : 0 < d_{ij} < r \}$, with $r > 0$ and $d_{ij}$ the distance between sites $i$ and $j$
First and second order neighborhood systems
[Figure: two lattice diagrams, each marking a site X and its neighbors.]
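A minimal sketch of how a neighborhood matrix for a regular lattice could be built. It assumes the common reading in which first-order neighbors share an edge (up to 4 per site) and second-order neighbors share an edge or a corner (up to 8 per site); the slides do not spell this out. The function name `lattice_adjacency` and the row-major site numbering are illustrative choices.

```python
import numpy as np

def lattice_adjacency(n_rows, n_cols, order=1):
    """Return the 0/1 neighborhood matrix for an n_rows x n_cols lattice.

    order=1: neighbors share an edge (assumed meaning of "first order").
    order=2: neighbors share an edge or a corner (assumed "second order").
    Sites are numbered row by row: site index = row * n_cols + col.
    """
    if order not in (1, 2):
        raise ValueError("order must be 1 or 2")
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if order == 2:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    n = n_rows * n_cols
    A = np.zeros((n, n), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                # Skip offsets that fall outside the lattice.
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    A[i, rr * n_cols + cc] = 1
    return A

# A 16 x 16 lattice, the size of the phosphate example.
A1 = lattice_adjacency(16, 16, order=1)
A2 = lattice_adjacency(16, 16, order=2)
counts_1 = A1.sum(axis=1)  # number of first-order neighbors per site
counts_2 = A2.sum(axis=1)  # number of second-order neighbors per site
```

Corner sites have the fewest neighbors and interior sites the most, so the neighbor counts vary across the lattice.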
Goal
Model selection for spatial lattice data using a default Bayesian approach, where the competing models:
• Have the same mean structure
• Have different covariance structures
CAR MODELS
Conditional specification: for $i = 1, \ldots, n$,
$$(Y_i \mid Y_{(i)}) \sim N\Big( x_i'\beta + \sum_{j=1}^{n} c_{ij}\,(Y_j - x_j'\beta),\ \tau_i^2 \Big)$$
• $Y_{(i)} = \{ Y_j : j \neq i \}$
• $x_j' = (x_{j1}, \ldots, x_{jp})$
• $\beta \in \mathbb{R}^p$, $\tau_i > 0$
• $c_{ij} \geq 0$, and $c_{ij} > 0$ iff $i \sim j$
Let $M = \mathrm{diag}(\tau_1^2, \ldots, \tau_n^2)$ and $C = (c_{ij})$ satisfy
• $M^{-1} C$ is symmetric, so $c_{ij}\tau_j^2 = c_{ji}\tau_i^2$
• $M^{-1}(I_n - C)$ is positive definite
Joint specification:
$$Y \sim N_n\big( X\beta,\ (I_n - C)^{-1} M \big), \quad \text{where } X = (x_1, \ldots, x_n)'$$
Parameterization
• $M = \sigma^2 G$, with $\sigma^2 > 0$ unknown and $G$ diagonal (known)
• $C = \phi W$, with $\phi$ the "spatial parameter" and $W = (w_{ij})$ a known nonnegative "weight" matrix (not necessarily symmetric), with $w_{ij} > 0$ iff $i \sim j$
Let $A = (a_{ij})$ be the neighborhood matrix: $a_{ij} = 1$ if $i \sim j$, and $a_{ij} = 0$ otherwise
Classes of CAR Models
• Homogeneous CAR (HCAR): $G = I_n$, $W = A$
• Weighted CAR (WCAR) (Besag et al., 1991): $G = \mathrm{diag}(|N_1|^{-1}, \ldots, |N_n|^{-1})$, $W = GA$, with $|N_i| = \sum_{j=1}^{n} a_{ij}$
• Autocorrelation CAR (ACAR) (Cressie & Chan, 1989): $G = \mathrm{diag}(|N_1|^{-1}, \ldots, |N_n|^{-1})$, $W = G^{1/2} A G^{-1/2}$
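The three classes differ only in the pair $(G, W)$ derived from the neighborhood matrix $A$. The sketch below builds each pair for a small 3 by 3 first-order lattice and checks that $G^{-1}W$ is symmetric, which is equivalent to the requirement that $M^{-1}C$ be symmetric when $M = \sigma^2 G$ and $C = \phi W$. The helper names (`path_adjacency`, `grid_adjacency`, `car_matrices`) are illustrative, and every site is assumed to have at least one neighbor.

```python
import numpy as np

def path_adjacency(m):
    """Adjacency matrix of m sites in a row, each linked to the next."""
    P = np.zeros((m, m))
    idx = np.arange(m - 1)
    P[idx, idx + 1] = 1.0
    P[idx + 1, idx] = 1.0
    return P

def grid_adjacency(n_rows, n_cols):
    """First-order (edge-sharing) neighborhood matrix of a regular lattice."""
    return (np.kron(np.eye(n_rows), path_adjacency(n_cols))
            + np.kron(path_adjacency(n_rows), np.eye(n_cols)))

def car_matrices(A, kind):
    """Return (G, W) for kind in {"HCAR", "WCAR", "ACAR"}."""
    n = A.shape[0]
    if kind == "HCAR":
        return np.eye(n), A.copy()
    g = 1.0 / A.sum(axis=1)          # diagonal of G: 1 / |N_i|
    G = np.diag(g)
    if kind == "WCAR":
        return G, G @ A              # W = G A (rows of W sum to one)
    if kind == "ACAR":
        root = np.sqrt(g)
        # W = G^{1/2} A G^{-1/2}
        return G, np.diag(root) @ A @ np.diag(1.0 / root)
    raise ValueError("unknown CAR class: " + kind)

A = grid_adjacency(3, 3)
matrices = {kind: car_matrices(A, kind) for kind in ("HCAR", "WCAR", "ACAR")}

# G^{-1} W should be symmetric for every class.
symmetric = {}
for kind, (G, W) in matrices.items():
    GinvW = np.linalg.solve(G, W)
    symmetric[kind] = bool(np.allclose(GinvW, GinvW.T))
```

Note that $W$ itself is not symmetric for WCAR and ACAR on this lattice, because the sites have different numbers of neighbors; only $G^{-1}W$ is.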
Facts
Assume the above conditions hold and $G^{-1} W$ is symmetric. Then:
(a) $G^{-1/2} W G^{1/2}$ is symmetric
(b) $G^{-1/2} W G^{1/2}$ and $W$ have the same nonzero eigenvalues, and all are real
(c) $M$ and $C$ determine a CAR model iff $\sigma^2 > 0$ and $\phi \in (\lambda_n^{-1}, \lambda_1^{-1})$, with $\lambda_1 \geq \ldots \geq \lambda_n$ the ordered eigenvalues of $G^{-1/2} W G^{1/2}$
Parameter space: $\Omega = \mathbb{R}^p \times (0, \infty) \times (\lambda_n^{-1}, \lambda_1^{-1})$
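As an illustration of fact (c), the admissible interval for $\phi$ can be computed from the eigenvalues of $G^{-1/2} W G^{1/2}$. The sketch below does this for the HCAR and WCAR classes on a 16 by 16 first-order lattice; the function name `phi_interval` and the lattice construction are assumptions made for the example.

```python
import numpy as np

def phi_interval(G, W):
    """Return (1/lambda_n, 1/lambda_1) for the eigenvalues of G^{-1/2} W G^{1/2}."""
    root_g = np.sqrt(np.diag(G))
    # Entry (i, j) of G^{-1/2} W G^{1/2} is g_i^{-1/2} * w_ij * g_j^{1/2}.
    S = W * root_g[None, :] / root_g[:, None]
    S = 0.5 * (S + S.T)              # remove round-off asymmetry
    lam = np.linalg.eigvalsh(S)      # real eigenvalues in ascending order
    return 1.0 / lam[0], 1.0 / lam[-1]

def path_adjacency(m):
    """Adjacency matrix of m sites in a row, each linked to the next."""
    P = np.zeros((m, m))
    idx = np.arange(m - 1)
    P[idx, idx + 1] = 1.0
    P[idx + 1, idx] = 1.0
    return P

side = 16
P = path_adjacency(side)
# First-order neighborhood matrix of a side x side lattice.
A = np.kron(np.eye(side), P) + np.kron(P, np.eye(side))
n = side * side

hcar_interval = phi_interval(np.eye(n), A)   # G = I, W = A
G = np.diag(1.0 / A.sum(axis=1))
wcar_interval = phi_interval(G, G @ A)       # G = diag(1/|N_i|), W = G A
```

For this lattice the HCAR interval is roughly $(-0.254, 0.254)$, since the largest eigenvalue of $A$ is $4\cos(\pi/17)$, while the WCAR interval is $(-1, 1)$.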
SAR MODELS
Conditional specification: for $i = 1, \ldots, n$,
$$Y_i = x_i'\beta + \sum_{j=1}^{n} b_{ij}\,(Y_j - x_j'\beta) + \epsilon_i$$
• $\epsilon_i \sim N(0, \xi_i^2)$, independent
• $\beta \in \mathbb{R}^p$, $\xi_i > 0$
• $b_{ij} \geq 0$, and $b_{ij} > 0$ iff $i \sim j$
Let $M = \mathrm{diag}(\xi_1^2, \ldots, \xi_n^2)$ and $B = (b_{ij})$ be such that $I_n - B$ is nonsingular. Then
Joint specification:
$$Y \sim N_n\big( X\beta,\ (I_n - B)^{-1} M (I_n - B')^{-1} \big)$$
Particular Model:
• $M = \sigma^2 I_n$
• $B = \phi A$
so
$$Y \sim N_n\big( X\beta,\ \sigma^2 ((I_n - \phi A)^2)^{-1} \big)$$
Parameter space: $\Omega = \mathbb{R}^p \times (0, \infty) \times (\lambda_n^{-1}, \lambda_1^{-1})$, with $\lambda_1 \geq \ldots \geq \lambda_n$ the ordered eigenvalues of $A$
MODEL SELECTION
Let $M_1, M_2, \ldots, M_k$ be the candidate models ($k \geq 2$).
$M_j$ is either HCAR, WCAR, ACAR or SAR, parameterized by $\eta_j = (\beta, \sigma_j^2, \phi_j) \in \Omega_j$, with covariance depending on $G_j$ and $A_j$.
$\phi_j \in (1/\lambda_n^{(j)}, 1/\lambda_1^{(j)})$, with $\lambda_1^{(j)} \geq \lambda_2^{(j)} \geq \ldots \geq \lambda_n^{(j)}$ the eigenvalues of:
• $A_j$ in the case of HCAR, ACAR and SAR
• $G_j^{1/2} A_j G_j^{1/2}$ in the case of WCAR
The approach proposed here assumes all models have the same mean structure
Likelihood for $M_j$
$$L_j(\eta_j; y) = (2\pi\sigma_j^2)^{-n/2}\, |\Sigma_{\phi_j}^{-1}|^{1/2} \exp\Big\{ -\frac{1}{2\sigma_j^2} (y - X\beta)' \Sigma_{\phi_j}^{-1} (y - X\beta) \Big\}$$
where
$$\Sigma_{\phi_j}^{-1} = \begin{cases} I_n - \phi_j A_j & \text{for HCAR models} \\ G_j^{-1} - \phi_j A_j & \text{for WCAR models} \\ G_j^{-1} - \phi_j G_j^{-1/2} A_j G_j^{-1/2} & \text{for ACAR models} \\ (I_n - \phi_j A_j)^2 & \text{for SAR models} \end{cases}$$
Prior for $M_j$
$$\pi(\eta_j \mid M_j) \propto \frac{\pi(\phi_j \mid M_j)}{\sigma_j^2}\, 1_{\Omega_j}(\eta_j)$$
Two options for $\pi(\phi_j \mid M_j)$:
• Uniform: $\pi^{U}(\phi_j \mid M_j) = 1_{(1/\lambda_n^{(j)},\, 1/\lambda_1^{(j)})}(\phi_j)$
• Independence Jeffreys:
$$\pi^{J1}(\phi_j \mid M_j) = \Big[ \sum_{i=1}^{n} \Big( \frac{\lambda_i^{(j)}}{1 - \phi_j \lambda_i^{(j)}} \Big)^2 - \frac{1}{n} \Big( \sum_{i=1}^{n} \frac{\lambda_i^{(j)}}{1 - \phi_j \lambda_i^{(j)}} \Big)^2 \Big]^{1/2} 1_{(1/\lambda_n^{(j)},\, 1/\lambda_1^{(j)})}(\phi_j)$$
(De Oliveira & Song, 2008; De Oliveira, 2011)
[Figure (a): prior densities $\pi(\phi)$ plotted against $\phi$ (ticks from −0.2 to 0.2) for the independence Jeffreys, Jeffreys-rule, and uniform priors.]
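Bayes Factors & Posterior Model Probabilities
$$\frac{\pi(M_i \mid y)}{\pi(M_j \mid y)} = \frac{m(y \mid M_i)}{m(y \mid M_j)} \cdot \frac{\pi(M_i)}{\pi(M_j)} = B_{ij} \times \text{prior odds}_{ij}$$
where
$$m(y \mid M_j) = \int_{\Omega_j} L_j(\eta_j; y)\, \pi(\eta_j \mid M_j)\, d\eta_j \quad \text{and} \quad B_{ij} = \frac{m(y \mid M_i)}{m(y \mid M_j)}$$
Hence
$$\pi(M_j \mid y) = \Big[ \sum_{l=1}^{k} \frac{\pi(M_l)}{\pi(M_j)} B_{lj} \Big]^{-1}, \quad j = 1, \ldots, k$$
$$\phantom{\pi(M_j \mid y)} = \frac{m(y \mid M_j)}{\sum_{l=1}^{k} m(y \mid M_l)} \quad \text{when } \pi(M_j) = \frac{1}{k}$$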
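Because marginal likelihoods for lattice data are typically tiny numbers, the posterior model probabilities are easier to evaluate from log marginal likelihoods. The following is a minimal sketch of that calculation; the function name `posterior_model_probs` and the numerical values of the log marginals are hypothetical, chosen only to illustrate the arithmetic, and all prior model probabilities are assumed to be strictly positive.

```python
import math

def posterior_model_probs(log_marginals, prior_probs=None):
    """Posterior model probabilities from log m(y | M_j) and prior pi(M_j).

    Uses the log-sum-exp trick so that very small marginal likelihoods
    do not underflow to zero. Defaults to equal prior probabilities 1/k.
    """
    k = len(log_marginals)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k
    # log of m(y | M_j) * pi(M_j) for each model
    log_terms = [lm + math.log(p) for lm, p in zip(log_marginals, prior_probs)]
    top = max(log_terms)
    weights = [math.exp(t - top) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for four candidate models.
log_m = [-1200.0, -1203.0, -1198.5, -1210.0]
probs = posterior_model_probs(log_m)

# The Bayes factor of model 3 versus model 1 on the log scale.
log_B_31 = log_m[2] - log_m[0]
```

With equal prior probabilities the ratio of any two posterior probabilities equals the corresponding Bayes factor, which gives a quick consistency check on the output.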
Remarks
• Bayes factors and posterior model probabilities are, in general, undetermined when improper priors are used
• An important exception occurs when the competing models have the same invariance structure, up to individual model parameters that have proper priors (Berger et al., 1998)
• CAR and SAR models fit this exception when all the competing models have the same mean structure and $\pi(\phi_j \mid M_j)$ is proper
Fact
As $\phi_j \to 1/\lambda_i^{(j)}$, for $i = 1$ or $n$,
$$\pi^{J1}(\phi_j \mid M_j) = O\big( (1 - \phi_j \lambda_i^{(j)})^{-1} \big)$$
so $\pi^{J1}(\phi_j \mid M_j)$ is not integrable (De Oliveira & Song, 2008).
Instead we use $(\pi^{J1}(\phi_j \mid M_j))^r$, with $r < 1$, which is proper and has the same "shape".
For $j = 1, \ldots, k$:
$$m(y \mid M_j) = K c_j \int_{1/\lambda_n^{(j)}}^{1/\lambda_1^{(j)}} h(\phi_j, M_j, y)\, d\phi_j$$
where
$$h(\phi_j, M_j, y) = |\Sigma_{\phi_j}^{-1}|^{1/2}\, |X' \Sigma_{\phi_j}^{-1} X|^{-1/2}\, (S_{\phi_j}^2)^{-(n-p)/2}\, \pi(\phi_j \mid M_j)$$
$$S_{\phi_j}^2 = (y - X\hat\beta_{\phi_j})' \Sigma_{\phi_j}^{-1} (y - X\hat\beta_{\phi_j})$$
$$\hat\beta_{\phi_j} = (X' \Sigma_{\phi_j}^{-1} X)^{-1} X' \Sigma_{\phi_j}^{-1} y$$
$$K = \frac{\Gamma\big(\frac{n-p}{2}\big)}{\pi^{\frac{n-p}{2}}}, \qquad c_j = \Big( \int_{1/\lambda_n^{(j)}}^{1/\lambda_1^{(j)}} \pi(\phi_j \mid M_j)\, d\phi_j \Big)^{-1}$$
Note
• For posterior model probabilities to be well defined and calibrated, the proportionality constants in the likelihoods and priors of all competing models should be retained
• Computation of $m(y \mid M_j)$ involves one-dimensional integration over a bounded interval
Computation
• Computation of $\hat c_j$ is straightforward: numerical quadrature or Monte Carlo,
$$\hat c_j = \Big[ \Big( \frac{1}{\lambda_1^{(j)}} - \frac{1}{\lambda_n^{(j)}} \Big) \frac{1}{m} \sum_{l=1}^{m} \big( \pi^{J1}(\phi_j^{(l)} \mid M_j) \big)^{1/2} \Big]^{-1}$$
with $\phi_j^{(1)}, \ldots, \phi_j^{(m)} \overset{iid}{\sim} \mathrm{unif}(1/\lambda_n^{(j)}, 1/\lambda_1^{(j)})$
• Computation of $m(y \mid M_j)$ requires more care: for moderate or large sample sizes, $h(\phi_j, M_j, y)$ is highly peaked and concentrated near the right boundary of the interval. It is therefore almost constant and very close to zero over most of the integration region, so common numerical quadrature or Monte Carlo estimates are often zero.
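One generic way to keep such an integral from underflowing is to evaluate $\log h$ on a grid and accumulate the quadrature sum on the log scale. The sketch below does this for an HCAR model with the uniform prior on $\phi$, using a midpoint rule on a fixed grid. It is an illustration under these assumptions, not necessarily the computational method used by the authors; the function name `log_marginal_hcar_uniform`, the grid size, and the simulated data are all hypothetical, and a fixed grid may need to be refined (or replaced by an adaptive rule) when the integrand is very sharply peaked.

```python
import numpy as np
from math import lgamma, log, pi

def log_marginal_hcar_uniform(y, X, A, n_grid=4000):
    """log m(y | M) for an HCAR model with the uniform prior on phi.

    Midpoint rule on a fixed grid, accumulated on the log scale.
    """
    n, p = X.shape
    lam, V = np.linalg.eigh(A)            # A = V diag(lam) V', lam ascending
    lo, hi = 1.0 / lam[0], 1.0 / lam[-1]  # phi lies in (1/lam_n, 1/lam_1)
    yt, Xt = V.T @ y, V.T @ X             # rotate so I - phi A is diagonal
    step = (hi - lo) / n_grid
    phis = lo + (np.arange(n_grid) + 0.5) * step   # interval midpoints
    log_h = np.empty(n_grid)
    for k, phi in enumerate(phis):
        d = 1.0 - phi * lam               # eigenvalues of I - phi A, all > 0
        XtD = Xt * d[:, None]
        XtDX = Xt.T @ XtD                 # X' Sigma^{-1} X
        beta_hat = np.linalg.solve(XtDX, XtD.T @ yt)
        resid = yt - Xt @ beta_hat
        S2 = resid @ (d * resid)          # generalized residual sum of squares
        log_h[k] = (0.5 * np.sum(np.log(d))
                    - 0.5 * np.linalg.slogdet(XtDX)[1]
                    - 0.5 * (n - p) * np.log(S2))
    top = log_h.max()
    log_integral = top + np.log(np.sum(np.exp(log_h - top))) + np.log(step)
    log_K = lgamma(0.5 * (n - p)) - 0.5 * (n - p) * log(pi)
    log_c = -log(hi - lo)                 # normalizing constant of the uniform prior
    return log_K + log_c + log_integral

def path_adjacency(m):
    """Adjacency matrix of m sites in a row, each linked to the next."""
    P = np.zeros((m, m))
    idx = np.arange(m - 1)
    P[idx, idx + 1] = 1.0
    P[idx + 1, idx] = 1.0
    return P

# Synthetic data on a 6 x 6 first-order lattice (for illustration only).
P6 = path_adjacency(6)
A = np.kron(np.eye(6), P6) + np.kron(P6, np.eye(6))
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(36), rng.normal(size=36)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=36)

log_m = log_marginal_hcar_uniform(y, X, A)
```

Working with the eigendecomposition of $A$ makes each evaluation of $\log h$ cheap, because $I_n - \phi A$ becomes diagonal in the rotated coordinates. The resulting log marginal likelihoods can be compared across models directly on the log scale.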
[Figure: six panels plotting $\pi(\phi \mid y)$ against $\phi$ (ticks from −0.3 to 0.3).]