Exploring Large Regression Model Spaces via Trans-dimensional Genetic Algorithms
Ricardo S. Ehlers
ICMC - USP
http://www.icmc.usp.br/~ehlers
ehlers@icmc.usp.br
Joint work with Marco A.R. Ferreira, University of Missouri.
UFSCar, April 2009

Searching for the “Best” Model(s)
• Suppose that the number M of alternative models is quite large. E.g. a linear model with 19 possible covariates has $2^{19} = 524288$ alternative models (with no interactions).
• Enumerating every possible model, estimating it, and attaching a measure of fit and parsimony to each may not be the best strategy.
• How to compare competing models?
• How to make averaged inferences using the competing models (or a subset of them)?
Bayesian Approach
• Models $M_1, \dots, M_k$ are assigned prior probabilities $p(M_i)$.
• Each model $M_i$ has a parameter vector $\theta_i \in \mathbb{R}^{n_i}$ with:
  – a likelihood function $p(y \mid \theta_i, M_i)$
  – a prior distribution $p(\theta_i \mid M_i)$.
• By Bayes' theorem,
$$\pi(M_i, \theta_i) \propto p(y \mid \theta_i, M_i)\, p(\theta_i \mid M_i)\, p(M_i)$$
$$p(M_i \mid y) \propto p(y \mid M_i)\, p(M_i)$$
$$p(y \mid M_i) = \int p(y \mid \theta_i, M_i)\, p(\theta_i \mid M_i)\, d\theta_i$$
Approaches
• Akaike (1974): $\mathrm{AIC}(\hat\theta_i, M_i) = -2 \log p(y \mid \hat\theta_i, M_i) + 2 n_i$
• Schwarz (1978): $\mathrm{BIC}(\hat\theta_i, M_i) = -2 \log p(y \mid \hat\theta_i, M_i) + n_i \log T$
• Spiegelhalter et al. (2002): $\mathrm{DIC}(\theta_i, M_i) = -2 \log p(y \mid \theta_i, M_i) + 2 p_D$
• Gelfand and Ghosh (1998): $D_\gamma = \frac{\gamma}{\gamma+1} \sum_{i=1}^{n} (\mu_i - y_{i,\mathrm{obs}})^2 + \sum_{i=1}^{n} \sigma_i^2$
• George and McCulloch (1993): SSVS (stochastic search variable selection)
• Chen (2005), Chib (1995), Chib and Jeliazkov (2001), Friel and Pettit (2008): estimating the marginal likelihood.
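As a minimal sketch of the first two criteria, the functions below compute AIC and BIC from a maximised log-likelihood. The function names and the numbers in the example are hypothetical and are not taken from the talk.

```python
import math


def aic(loglik_hat: float, n_params: int) -> float:
    """AIC = -2 log p(y | theta_hat, M) + 2 n, with n the number of parameters."""
    return -2.0 * loglik_hat + 2.0 * n_params


def bic(loglik_hat: float, n_params: int, n_obs: int) -> float:
    """BIC = -2 log p(y | theta_hat, M) + n log T, with T the sample size."""
    return -2.0 * loglik_hat + n_params * math.log(n_obs)


# Hypothetical fits: (maximised log-likelihood, number of parameters).
small_model = (-120.0, 3)
large_model = (-118.5, 6)
T = 47  # hypothetical sample size

for name, (ll, n) in [("small", small_model), ("large", large_model)]:
    print(name, round(aic(ll, n), 2), round(bic(ll, n, T), 2))
```

Because log T exceeds 2 once T is above about 7, BIC penalises each extra parameter more heavily than AIC for samples of that size.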
Genetic Algorithms
Holland (1975), Chatterjee, Laudato, and Lynch (1996)

A population of M individuals, each of dimension L:
$$
\begin{pmatrix}
x_{11} & \cdots & x_{1k} & \cdots & x_{1L} \\
\vdots &        & \vdots &        & \vdots \\
x_{i1} & \cdots & x_{ik} & \cdots & x_{iL} \\
\vdots &        & \vdots &        & \vdots \\
x_{j1} & \cdots & x_{jk} & \cdots & x_{jL} \\
\vdots &        & \vdots &        & \vdots \\
x_{M1} & \cdots & x_{Mk} & \cdots & x_{ML}
\end{pmatrix}
$$
Apply genetic operators to transform the population: selection, crossover, mutation.
Trans-dimensional Jumps
Green (1995)
• Propose a jump from model $M_i$ to model $M_j$ w.p. $r_{ij}$,
• generate a vector $u$ of dimension $n_j - n_i$ from a proposal density $q(u)$,
• set $\theta_j = f_{ij}(\theta_i, u)$, where $f_{ij} : \Theta_i \times \mathbb{R}^{n_j - n_i} \to \Theta_j$ denotes a bijective function.
• Accept the jump w.p. $\min(1, A)$, where
$$A = \underbrace{\frac{\pi(\theta_j, M_j)}{\pi(\theta_i, M_i)}}_{\text{target ratio}} \; \underbrace{\frac{r_{ji}}{r_{ij}\, q(u)}}_{\text{proposal ratio}} \; \left| \frac{\partial f_{ij}(\theta_i, u)}{\partial (\theta_i, u)} \right|$$
The choice of the proposal distribution $q$ is crucial for covering both the model space and the parameter space.
We assume that:
• $\theta_i \mid M_i$ is easy to estimate using standard methods and software.
• The posterior distribution on model space is well approximated by
$$P(M_k \mid y) \propto \exp\{-\mathrm{BIC}(\hat\theta_k, k)/2\},$$
$$\mathrm{BIC}(\hat\theta_k, k) = -2 \log p(y \mid \hat\theta_k, k) + n_k \log T,$$
where $\hat\theta_k$ is the maximum likelihood estimate under model $M_k$.
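The approximation above can be sketched as follows for a small set of models whose BIC values are already known. The function name and the BIC values are hypothetical; subtracting the smallest BIC before exponentiating is a numerical-stability choice made here, not something stated in the talk.

```python
import math


def bic_model_probs(bics):
    """Normalise exp(-BIC/2) over a finite list of models.

    The smallest BIC is subtracted first so that exp() never underflows
    for the best model; the shift cancels in the normalisation.
    """
    best = min(bics)
    weights = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical BIC values for three competing models.
probs = bic_model_probs([251.6, 253.6, 260.1])
print([round(p, 4) for p in probs])
```

A BIC difference of 2 between two models corresponds to a ratio of e between their approximate probabilities.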
RJMCMC + Genetic Algorithms
$$g(E(Y)) = \beta_0 + \beta_{j_1} x_{j_1} + \cdots + \beta_{j_k} x_{j_k}, \qquad k = 0, \dots, k_{\max}$$
Given a population of models $Z = (z_1, \dots, z_M)$, where $z_{ij} \in \{0, 1\}$ indicates whether covariate $j$ is included in model $i$:
1. propose a new population $z'$ via genetic operators (esp. mutation and crossover),
2. accept the new population with probability
$$\min\left\{1, \; \frac{\exp\{-\mathrm{BIC}(z')/2\}}{\exp\{-\mathrm{BIC}(z)/2\}} \, \frac{P(z', z)}{P(z, z')}\right\}$$
where $P(z, z') = \Pr(\text{proposing a jump from population } z \text{ to } z')$.
Crossover Move
Combine pairs of models to generate offspring, which are more likely to be accepted if they perform well.
Randomly choose a pair of individuals $z_i$, $z_j$ and propose a new population as follows:
1. select the positions at which the two individuals differ, $K = \{k : z_{ik} \neq z_{jk}\}$
2. randomly choose $k \in K$
3. set $z'_{ik} = z_{jk}$ and $z'_{jk} = z_{ik}$
4. accept this new population with probability
$$\min\left\{1, \; \frac{\exp(-\mathrm{BIC}(z'_i)/2 - \mathrm{BIC}(z'_j)/2)}{\exp(-\mathrm{BIC}(z_i)/2 - \mathrm{BIC}(z_j)/2)} \, \frac{P(z', z)}{P(z, z')}\right\}$$
Repeat this updating scheme for all $\lfloor M/2 \rfloor$ pairs selected without replacement from the population.
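The crossover steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `bic` is a hypothetical user-supplied function returning the BIC of a 0/1 inclusion vector, and k is assumed to be drawn uniformly from K. Under that assumption the set K is the same before and after the swap, so the proposal ratio P(z', z)/P(z, z') equals 1 and drops out.

```python
import math
import random


def crossover_step(zi, zj, bic, rng):
    """Propose swapping one differing position between zi and zj.

    Returns (zi, zj, accepted). `bic` maps a 0/1 list to a BIC value
    (hypothetical helper supplied by the caller).
    """
    K = [k for k in range(len(zi)) if zi[k] != zj[k]]
    if not K:  # identical individuals: nothing to swap
        return zi, zj, False
    k = rng.choice(K)  # uniform choice, so the proposal ratio is 1
    new_i, new_j = list(zi), list(zj)
    new_i[k], new_j[k] = zj[k], zi[k]
    log_a = -(bic(new_i) + bic(new_j)) / 2.0 + (bic(zi) + bic(zj)) / 2.0
    if rng.random() < math.exp(min(0.0, log_a)):
        return new_i, new_j, True
    return zi, zj, False


def crossover_sweep(pop, bic, rng):
    """Apply crossover_step to floor(M/2) disjoint, randomly formed pairs."""
    order = list(range(len(pop)))
    rng.shuffle(order)
    for a, b in zip(order[0::2], order[1::2]):
        pop[a], pop[b], _ = crossover_step(pop[a], pop[b], bic, rng)
    return pop


# Toy usage: a made-up BIC that favours including covariates 0 and 2 only.
target = [1, 0, 1, 0]
toy_bic = lambda z: 10.0 * sum(a != b for a, b in zip(z, target))
pop = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
print(crossover_sweep(pop, toy_bic, random.Random(0)))
```

Crossover only exchanges values between individuals, so the number of individuals that include a given covariate never changes; new inclusion patterns at a position have to come from mutation.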
Mutation Move
Include a new regressor w.p. $w$, or delete an existing one w.p. $1 - w$.
Suppose we are updating $z_i$ and propose an inclusion. Define $R_0 = \{j : z_{ij} = 0\}$ and $R_1 = \{j : z_{ij} = 1\}$. Then,
1. randomly choose $j \in R_0$ and set $z'_{ij} = 1$
2. accept this move w.p. $\min(1, A)$, where
$$A = \frac{\exp(-\mathrm{BIC}(z'_i)/2)}{\exp(-\mathrm{BIC}(z_i)/2)} \, \frac{(1 - w)\, |R_0|}{w\, (|R_1| + 1)}$$
and $|J|$ denotes the cardinality of $J$.
Likewise, if a deletion is proposed,
1. choose $j \in R_1$ and set $z'_{ij} = 0$.
2. accept the move w.p. $\min(1, A^{-1})$, with $A$ evaluated for the reverse (inclusion) move from $z'_i$ back to $z_i$.
Repeat this updating scheme for all $z_1, \dots, z_M$.
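A minimal sketch of the mutation move follows. The `bic` argument is a hypothetical user-supplied function, 0 < w < 1 is assumed, and the handling of the boundary cases (an inclusion proposed for the full model, or a deletion for the empty model, is simply rejected) is an assumption of this sketch, since the slide does not address them.

```python
import math
import random


def log_inclusion_ratio(n0, n1, w):
    """log of (1 - w) |R0| / (w (|R1| + 1)), the proposal part of A."""
    return math.log(1.0 - w) + math.log(n0) - math.log(w) - math.log(n1 + 1)


def mutation_step(z, bic, w, rng):
    """One mutation proposal for a single model z; returns (z, accepted)."""
    R0 = [j for j, v in enumerate(z) if v == 0]
    R1 = [j for j, v in enumerate(z) if v == 1]
    new = list(z)
    if rng.random() < w:  # propose an inclusion
        if not R0:
            return z, False  # full model: nothing to include (assumption)
        new[rng.choice(R0)] = 1
        log_prop = log_inclusion_ratio(len(R0), len(R1), w)
    else:  # propose a deletion
        if not R1:
            return z, False  # empty model: nothing to delete (assumption)
        new[rng.choice(R1)] = 0
        # A^{-1}, with A evaluated at the proposed state (reverse inclusion).
        log_prop = -log_inclusion_ratio(len(R0) + 1, len(R1) - 1, w)
    log_a = -(bic(new) - bic(z)) / 2.0 + log_prop
    if rng.random() < math.exp(min(0.0, log_a)):
        return new, True
    return z, False


# Toy usage with a made-up BIC that favours covariates 0 and 2 only.
target = [1, 0, 1, 0]
toy_bic = lambda z: 10.0 * sum(a != b for a, b in zip(z, target))
rng = random.Random(1)
z = [0, 0, 0, 0]
for _ in range(50):
    z, _ = mutation_step(z, toy_bic, 0.5, rng)
print(z)
```

Working on the log scale avoids overflow when the BIC difference between the current and proposed model is large.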
Example - Linear Regression
Effect of punishment regimes on crime rates in 47 US states, with 15 potential regressors (Raftery, Painter, and Volinsky 2005).

M     percentage of males aged 14-24
So    indicator variable for a southern state
Ed    mean years of schooling
Po1   police expenditure in 1960
Po2   police expenditure in 1959
LF    labour force participation rate
M.F   number of males per 1000 females
Pop   state population
NW    number of nonwhites per 1000 people
U1    unemployment rate of urban males 14-24
U2    unemployment rate of urban males 35-39
GDP   gross domestic product per head
Ineq  income inequality
Prob  probability of imprisonment
Time  average time served in state prisons
Model probabilities (Probs) for the ten models shown, one model per column, and inclusion probability (Prob.inc) for each covariate:

Probs  0.209  0.122  0.060  0.055  0.053  0.036  0.026  0.025  0.023  0.022  Prob.inc
M      1      1      1      1      1      1      1      1      1      1      0.9890
So     0      0      0      0      0      0      0      0      0      0      0.0549
Ed     1      1      1      1      1      1      1      1      1      1      1.0000
Po1    1      1      1      1      0      0      1      1      1      0      0.7714
Po2    0      0      0      0      1      1      0      0      0      1      0.2459
LF     0      0      0      0      0      0      0      0      0      0      0.0290
M.F    0      0      0      0      0      0      0      0      0      0      0.0347
Pop    0      0      0      1      0      0      0      0      0      0      0.2049
NW     1      1      1      1      1      1      1      1      0      1      0.9227
U1     0      0      0      0      0      0      0      1      0      0      0.0889
U2     1      1      1      1      1      1      0      1      1      1      0.8891
GDP    0      0      1      0      0      0      0      0      0      1      0.2414
Ineq   1      1      1      1      1      1      1      1      1      1      1.0000
Prob   1      1      1      1      1      1      1      1      1      1      0.9956
Time   1      0      1      0      0      1      1      1      0      1      0.4963
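An inclusion probability of this kind is the total probability of the models that contain a given covariate. The sketch below shows that calculation on a toy set of three models; the function name and the numbers are hypothetical. The Prob.inc column in the table presumably sums over all visited models, so it cannot be reproduced from the ten columns shown alone.

```python
def inclusion_probs(models, probs):
    """Sum model probabilities over the models that include each covariate.

    `models` is a list of 0/1 inclusion vectors and `probs` the matching
    model probabilities, which are renormalised in case they do not sum to 1.
    """
    total = sum(probs)
    n_cov = len(models[0])
    return [
        sum(p for z, p in zip(models, probs) if z[j] == 1) / total
        for j in range(n_cov)
    ]


# Toy example with two covariates and three models (made-up probabilities).
models = [[1, 0], [1, 1], [0, 1]]
probs = [0.5, 0.3, 0.2]
print(inclusion_probs(models, probs))
```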
[Figure: Models visited by GA-MCMC. Rows: covariates M, So, Ed, Po1, Po2, LF, M.F, Pop, NW, U1, U2, GDP, Ineq, Prob, Time. Horizontal axis: Model, with ticks from 1 to 54.]
Example - Logistic Regression
Risk factors associated with low infant birth weight (Hosmer and Lemeshow 1989).
$y_i \sim \mathrm{Bernoulli}(\pi_i)$, where $\pi_i$ is the probability that the $i$-th baby has low birth weight.
Under model $k$ this probability is related to the covariates by
$$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = X_i' \theta.$$
Model       Covariates indicator                                         Model
indicator   age    lwt    race   smoke  ptl    ht     ui     ftv    probability
35          0      1      0      0      0      1      0      0      0.0962
99          0      1      0      0      0      1      1      0      0.0673
51          0      1      0      0      1      1      0      0      0.0600
43          0      1      0      1      0      1      0      0      0.0599
107         0      1      0      1      0      1      1      0      0.0333
3           0      1      0      0      0      0      0      0      0.0294
115         0      1      0      0      1      1      1      0      0.0287
17          0      0      0      0      1      0      0      0      0.0239
19          0      1      0      0      1      0      0      0      0.0202
47          0      1      1      1      0      1      0      0      0.0202
Inclusion   0.190  0.696  0.140  0.381  0.349  0.659  0.376  0.081  –
probability
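The model indicators in the table are consistent with the encoding 1 + Σ_j z_j 2^(j-1), with covariates taken in the column order age, lwt, ..., ftv. For example, model 35 includes lwt and ht, and 1 + 2 + 32 = 35. This encoding is inferred from the ten rows shown and is not stated in the slides; the sketch below, with hypothetical function names, converts between the two representations.

```python
COVARIATES = ["age", "lwt", "race", "smoke", "ptl", "ht", "ui", "ftv"]


def decode(indicator):
    """Model indicator -> 0/1 inclusion vector (inferred encoding, see above)."""
    bits = indicator - 1
    return [(bits >> j) & 1 for j in range(len(COVARIATES))]


def encode(z):
    """0/1 inclusion vector -> model indicator."""
    return 1 + sum(b << j for j, b in enumerate(z))


def included(indicator):
    """Names of the covariates included in the model with this indicator."""
    return [name for name, b in zip(COVARIATES, decode(indicator)) if b]


print(included(35))
print(included(107))
```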
[Figure: Models visited by GA-MCMC. Rows: covariates age, lwt, race, smoke, ptl, ht, ui, ftv. Horizontal axis: Model, with ticks from 2 to 84.]
Example - Censored Survival Models
Survival times of patients with primary biliary cirrhosis,
$$h(t) = h_0(t) \exp(X_i' \theta).$$

age:       in years
alb:       serum albumin
alkphos:   alkaline phosphatase
ascites:   presence of ascites
bili:      serum bilirubin
edtrt:     edema treatment
hepmeg:    enlarged liver
platelet:  platelet count
protime:   standardised blood clotting time
sex:       1=male
sgot:      liver enzyme (now called AST)
spiders:   blood vessel malformations in the skin
stage:     histologic stage of disease (needs biopsy)
trt:       1/2/-9 for control, treatment, not randomised
copper:    urine copper