Mixture Models — Simulation-based Estimation Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Mixture Models — Simulation-based Estimation – p. 1/72
Outline
• Mixtures
• Capturing correlation
• Alternative specific variance
• Taste heterogeneity
• Latent classes
• Simulation-based estimation
Mixtures
In statistics, a mixture probability distribution function is a convex combination of other probability distribution functions. If $f(\varepsilon, \theta)$ is a distribution function, and if $w(\theta)$ is a nonnegative function such that
$$\int_\theta w(\theta)\, d\theta = 1$$
then
$$g(\varepsilon) = \int_\theta w(\theta) f(\varepsilon, \theta)\, d\theta$$
is also a distribution function. We say that $g$ is a $w$-mixture of $f$.
• If $f$ is a logit model, $g$ is a continuous $w$-mixture of logit.
• If $f$ is a MEV model, $g$ is a continuous $w$-mixture of MEV.
Mixtures
Discrete mixtures are also possible. If $w_i$, $i = 1, \ldots, n$, are nonnegative weights such that
$$\sum_{i=1}^{n} w_i = 1$$
then
$$g(\varepsilon) = \sum_{i=1}^{n} w_i f(\varepsilon, \theta_i)$$
is also a distribution function, where $\theta_i$, $i = 1, \ldots, n$, are parameters. We say that $g$ is a discrete $w$-mixture of $f$.
Example: discrete mixture of normal distributions
[Figure: densities of N(5, 0.16), N(8, 1) and the mixture 0.6 N(5, 0.16) + 0.4 N(8, 1); horizontal axis from 4 to 11, vertical axis from 0 to 2.5.]
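The mixture in this example can be evaluated directly from the definition of a discrete mixture. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the second argument of N(·, ·) is a standard deviation (the slide does not say; this reading appears consistent with a plotted peak near 2.5). It also checks numerically that the mixture still integrates to one.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    """Discrete mixture: g(x) = sum_i w_i f(x, theta_i), with theta_i = (mean, std)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to one")
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, components))

# Assumption: N(5, 0.16) and N(8, 1) are read as (mean, standard deviation).
weights = [0.6, 0.4]
components = [(5.0, 0.16), (8.0, 1.0)]

# Riemann sum on a wide grid: a valid density should integrate to about one.
step = 0.001
grid = [i * step for i in range(0, 20001)]  # 0 to 20
total = sum(mixture_pdf(x, weights, components) for x in grid) * step
print(total)
```

If the second argument is instead a variance, replace 0.16 by its square root; the mixture property is unaffected.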
Example: discrete mixture of binary logit models
[Figure: P(1|s=1,x), P(1|s=2,x) and the mixture 0.4 P(1|s=1,x) + 0.6 P(1|s=2,x); horizontal axis from -4 to 4, vertical axis from 0 to 1.]
Mixtures
• General motivation: generate flexible distributional forms
• For discrete choice:
  • correlation across alternatives
  • alternative specific variances
  • taste heterogeneity
  • ...
Back to the telephone example
Budget measured: $U_{BM} = \alpha_{BM} + \beta X_{BM} + \varepsilon_{BM}$
Standard measured: $U_{SM} = \alpha_{SM} + \beta X_{SM} + \varepsilon_{SM}$
Local flat: $U_{LF} = \alpha_{LF} + \beta X_{LF} + \varepsilon_{LF}$
Extended area flat: $U_{EF} = \alpha_{EF} + \beta X_{EF} + \varepsilon_{EF}$
Metro area flat: $U_{MF} = \beta X_{MF} + \varepsilon_{MF}$
Distributions for $\varepsilon$: logit, probit, nested logit
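Back to the telephone example
Covariance of $U$ (rows and columns ordered BM, SM, LF, EF, MF)
Logit:
$$\begin{pmatrix} \sigma^2 & 0 & 0 & 0 & 0 \\ 0 & \sigma^2 & 0 & 0 & 0 \\ 0 & 0 & \sigma^2 & 0 & 0 \\ 0 & 0 & 0 & \sigma^2 & 0 \\ 0 & 0 & 0 & 0 & \sigma^2 \end{pmatrix}$$
Probit:
$$\begin{pmatrix} \sigma^2_{BM} & \sigma_{BM,SM} & \sigma_{BM,LF} & \sigma_{BM,EF} & \sigma_{BM,MF} \\ \sigma_{BM,SM} & \sigma^2_{SM} & \sigma_{SM,LF} & \sigma_{SM,EF} & \sigma_{SM,MF} \\ \sigma_{BM,LF} & \sigma_{SM,LF} & \sigma^2_{LF} & \sigma_{LF,EF} & \sigma_{LF,MF} \\ \sigma_{BM,EF} & \sigma_{SM,EF} & \sigma_{LF,EF} & \sigma^2_{EF} & \sigma_{EF,MF} \\ \sigma_{BM,MF} & \sigma_{SM,MF} & \sigma_{LF,MF} & \sigma_{EF,MF} & \sigma^2_{MF} \end{pmatrix}$$
Nested logit:
$$\frac{\pi^2}{6\mu^2} \begin{pmatrix} 1 & \rho_M & 0 & 0 & 0 \\ \rho_M & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & \rho_F & \rho_F \\ 0 & 0 & \rho_F & 1 & \rho_F \\ 0 & 0 & \rho_F & \rho_F & 1 \end{pmatrix}, \qquad \rho_i = 1 - \frac{\mu^2}{\mu_i^2}$$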
Continuous mixtures of logit
• Combining probit and logit
• Error decomposed into two parts:
$$U_{in} = V_{in} + \xi + \nu$$
  • $\nu$: i.i.d. EV (logit): tractability
  • $\xi$: normal distribution (probit): flexibility
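Logit
• Utility:
$$U_{\text{auto}} = \beta X_{\text{auto}} + \nu_{\text{auto}}$$
$$U_{\text{bus}} = \beta X_{\text{bus}} + \nu_{\text{bus}}$$
$$U_{\text{subway}} = \beta X_{\text{subway}} + \nu_{\text{subway}}$$
• $\nu$ i.i.d. extreme value
• Probability:
$$\Pr(\text{auto} \mid X) = \frac{e^{\beta X_{\text{auto}}}}{e^{\beta X_{\text{auto}}} + e^{\beta X_{\text{bus}}} + e^{\beta X_{\text{subway}}}}$$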
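As a small illustration, the logit probability above can be computed as follows. This is a sketch rather than anyone's reference implementation: the function name and the numerical values of $\beta X$ are hypothetical and do not come from the slides.

```python
import math

def logit_probabilities(utilities):
    """Logit choice probabilities from a dict {alternative: systematic utility}."""
    # Subtract the maximum before exponentiating for numerical stability;
    # the shift cancels in the ratio, so the probabilities are unchanged.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    denom = sum(exp_v.values())
    return {alt: e / denom for alt, e in exp_v.items()}

# Hypothetical values of beta * X for each alternative (not from the slides).
V = {"auto": -1.0, "bus": -1.5, "subway": -1.2}
P = logit_probabilities(V)
print(P)
```

The probabilities sum to one, and the alternative with the largest systematic utility gets the largest probability.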
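Normal mixture of logit
• Utility:
$$U_{\text{auto}} = \beta X_{\text{auto}} + \xi_{\text{auto}} + \nu_{\text{auto}}$$
$$U_{\text{bus}} = \beta X_{\text{bus}} + \xi_{\text{bus}} + \nu_{\text{bus}}$$
$$U_{\text{subway}} = \beta X_{\text{subway}} + \xi_{\text{subway}} + \nu_{\text{subway}}$$
• $\nu$ i.i.d. extreme value, $\xi \sim N(0, \Sigma)$
• Probability:
$$\Pr(\text{auto} \mid X, \xi) = \frac{e^{\beta X_{\text{auto}} + \xi_{\text{auto}}}}{e^{\beta X_{\text{auto}} + \xi_{\text{auto}}} + e^{\beta X_{\text{bus}} + \xi_{\text{bus}}} + e^{\beta X_{\text{subway}} + \xi_{\text{subway}}}}$$
$$P(\text{auto} \mid X) = \int_\xi \Pr(\text{auto} \mid X, \xi) f(\xi)\, d\xi$$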
Simulation
$$P(\text{auto} \mid X) = \int_\xi \Pr(\text{auto} \mid X, \xi) f(\xi)\, d\xi$$
• Integral has no closed form.
• Monte Carlo simulation must be used.
Simulation
• In order to approximate
$$P(i \mid X) = \int_\xi \Pr(i \mid X, \xi) f(\xi)\, d\xi$$
• Draw from $f(\xi)$ to obtain $r_1, \ldots, r_R$
• Compute
$$P(i \mid X) \approx \tilde{P}(i \mid X) = \frac{1}{R} \sum_{k=1}^{R} P(i \mid X, r_k)$$
For example, for alternative 1 of three, with the draw $r_k$ entering the utilities of alternatives 1 and 2:
$$\tilde{P}(1 \mid X) = \frac{1}{R} \sum_{k=1}^{R} \frac{e^{V_{1n} + r_k}}{e^{V_{1n} + r_k} + e^{V_{2n} + r_k} + e^{V_{3n}}}$$
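The draw-and-average recipe can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the function names and numerical values are hypothetical, and $\Sigma$ is assumed diagonal so that each $\xi_j$ can be drawn independently (a general $\Sigma$ would require correlated draws, for instance through a Cholesky factor).

```python
import math
import random

def logit_probabilities(utilities):
    """Logit probabilities conditional on the utilities (including any draw)."""
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    denom = sum(exp_v.values())
    return {alt: e / denom for alt, e in exp_v.items()}

def simulated_probabilities(V, sigma, R, rng):
    """Average the conditional logit probabilities over R draws of xi."""
    totals = {alt: 0.0 for alt in V}
    for _ in range(R):
        # Assumption: Sigma is diagonal, so xi_j ~ N(0, sigma_j^2) independently.
        u = {alt: V[alt] + rng.gauss(0.0, sigma[alt]) for alt in V}
        for alt, p in logit_probabilities(u).items():
            totals[alt] += p
    return {alt: t / R for alt, t in totals.items()}

# Hypothetical systematic utilities and standard deviations (not from the slides).
V = {"auto": -1.0, "bus": -1.5, "subway": -1.2}
sigma = {"auto": 1.0, "bus": 0.5, "subway": 0.5}
P_tilde = simulated_probabilities(V, sigma, R=2000, rng=random.Random(42))
print(P_tilde)
```

Because the same draws are used for all alternatives, the simulated probabilities sum to one for any $R$. Setting every standard deviation to zero recovers the plain logit probabilities.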
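Capturing correlations: nesting
• Utility:
$$U_{\text{auto}} = \beta X_{\text{auto}} + \nu_{\text{auto}}$$
$$U_{\text{bus}} = \beta X_{\text{bus}} + \sigma_{\text{transit}} \eta_{\text{transit}} + \nu_{\text{bus}}$$
$$U_{\text{subway}} = \beta X_{\text{subway}} + \sigma_{\text{transit}} \eta_{\text{transit}} + \nu_{\text{subway}}$$
• $\nu$ i.i.d. extreme value, $\eta_{\text{transit}} \sim N(0, 1)$, $\sigma^2_{\text{transit}} = \operatorname{cov}(\text{bus}, \text{subway})$
• Probability:
$$\Pr(\text{auto} \mid X, \eta_{\text{transit}}) = \frac{e^{\beta X_{\text{auto}}}}{e^{\beta X_{\text{auto}}} + e^{\beta X_{\text{bus}} + \sigma_{\text{transit}} \eta_{\text{transit}}} + e^{\beta X_{\text{subway}} + \sigma_{\text{transit}} \eta_{\text{transit}}}}$$
$$P(\text{auto} \mid X) = \int_\eta \Pr(\text{auto} \mid X, \eta) f(\eta)\, d\eta$$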
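The claim that the shared error component induces a covariance of $\sigma^2_{\text{transit}}$ between bus and subway can be checked by simulation. The sketch below is illustrative: the value of $\sigma_{\text{transit}}$ and the helper names are hypothetical, and the extreme value terms are drawn with scale 1 by inverse transform.

```python
import math
import random

def gumbel(rng):
    """Standard extreme value (Gumbel) draw by inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def sample_covariance(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

rng = random.Random(0)
sigma_transit = 2.0  # hypothetical value, not from the slides
n_draws = 100_000

err_auto, err_bus, err_subway = [], [], []
for _ in range(n_draws):
    eta = rng.gauss(0.0, 1.0)  # shared by bus and subway only
    err_auto.append(gumbel(rng))
    err_bus.append(sigma_transit * eta + gumbel(rng))
    err_subway.append(sigma_transit * eta + gumbel(rng))

cov_bus_subway = sample_covariance(err_bus, err_subway)
cov_auto_bus = sample_covariance(err_auto, err_bus)
print(cov_bus_subway, sigma_transit ** 2)  # should be close to each other
print(cov_auto_bus)                        # should be close to zero
```

The auto error term shares nothing with the transit alternatives, so its covariance with bus stays near zero, which is the nesting pattern the specification is meant to capture.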
Nesting structure
Example: residential telephone

     ASC_BM  ASC_SM  ASC_LF  ASC_EF  BETA_C         σ_M   σ_F
BM   1       0       0       0       ln(cost(BM))   η_M   0
SM   0       1       0       0       ln(cost(SM))   η_M   0
LF   0       0       1       0       ln(cost(LF))   0     η_F
EF   0       0       0       1       ln(cost(EF))   0     η_F
MF   0       0       0       0       ln(cost(MF))   0     η_F
Nesting structure
Identification issues:
• If there are two nests, only one σ is identified
• If there are more than two nests, all σ's are identified
Walker (2001)
Results with 5000 draws:
                   NL               NML              NML (σ_F = 0)    NML (σ_M = 0)    NML (σ_F = σ_M)
Log likelihood     -473.219         -472.768         -473.146         -472.779         -472.846
                   Value    Scaled  Value    Scaled  Value    Scaled  Value    Scaled  Value    Scaled
ASC_BM             -1.784   1.000   -3.81247 1.000   -3.79131 1.000   -3.80999 1.000   -3.81327 1.000
ASC_EF             -0.558   0.313   -1.19899 0.314   -1.18549 0.313   -1.19711 0.314   -1.19672 0.314
ASC_LF             -0.512   0.287   -1.09535 0.287   -1.08704 0.287   -1.0942  0.287   -1.0948  0.287
ASC_SM             -1.405   0.788   -3.01659 0.791   -2.9963  0.790   -3.01426 0.791   -3.0171  0.791
B_LOGCOST          -1.490   0.835   -3.25782 0.855   -3.24268 0.855   -3.2558  0.855   -3.25805 0.854
FLAT               2.292
MEAS               2.063
σ_F                                 3.02027          0                3.06144          2.17138
σ_M                                 0.52875          3.024833         0                2.17138
σ_F² + σ_M²                         9.402            9.150            9.372            9.430
Comments
• The scale of the parameters is different between NL and the mixture model
• Normalization can be performed in several ways:
  • $\sigma_F = 0$
  • $\sigma_M = 0$
  • $\sigma_F = \sigma_M$
• Final log likelihood should be the same
• But... estimation relies on simulation
• Only an approximation of the log likelihood is available
• Final log likelihood with 50000 draws:
  Unnormalized: -472.872
  $\sigma_M = \sigma_F$: -472.875
  $\sigma_F = 0$: -472.884
  $\sigma_M = 0$: -472.901
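The effect of the number of draws on the simulated log likelihood can be illustrated on a toy problem. The sketch below is not the telephone model: it uses a hypothetical binary logit with one normal error component, hypothetical parameter values, and measures how much the simulated log probability varies across random seeds for a small and a large number of draws.

```python
import math
import random
import statistics

def simulated_probability(v1, v2, sigma, R, rng):
    """Simulated P(1) in a binary logit where alternative 1 carries sigma * xi, xi ~ N(0, 1)."""
    total = 0.0
    for _ in range(R):
        u1 = v1 + sigma * rng.gauss(0.0, 1.0)
        total += 1.0 / (1.0 + math.exp(v2 - u1))
    return total / R

def spread_across_seeds(R, n_seeds=20):
    """Standard deviation of the simulated log probability over independent seeds."""
    values = [
        math.log(simulated_probability(0.5, 0.0, 2.0, R, random.Random(seed)))
        for seed in range(n_seeds)
    ]
    return statistics.stdev(values)

spread_small = spread_across_seeds(R=50)
spread_large = spread_across_seeds(R=5000)
print(spread_small, spread_large)
```

The simulation noise shrinks roughly like $1/\sqrt{R}$, which is in line with the log likelihoods of the different normalizations being closer to each other with 50000 draws than with 5000.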
Cross nesting
[Figure: nesting tree with two nests. Nest 1 contains Bus, Train and Car; Nest 2 contains Car, Ped. and Bike.]
$$U_{\text{bus}} = V_{\text{bus}} + \xi_1 + \varepsilon_{\text{bus}}$$
$$U_{\text{train}} = V_{\text{train}} + \xi_1 + \varepsilon_{\text{train}}$$
$$U_{\text{car}} = V_{\text{car}} + \xi_1 + \xi_2 + \varepsilon_{\text{car}}$$
$$U_{\text{ped}} = V_{\text{ped}} + \xi_2 + \varepsilon_{\text{ped}}$$
$$U_{\text{bike}} = V_{\text{bike}} + \xi_2 + \varepsilon_{\text{bike}}$$
$$P(\text{car}) = \int_{\xi_1} \int_{\xi_2} P(\text{car} \mid \xi_1, \xi_2) f(\xi_1) f(\xi_2)\, d\xi_2\, d\xi_1$$
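The double integral for the cross-nested specification can be approximated with the same draw-and-average idea, using one draw per error component. The sketch below makes assumptions that the slide does not state: $\xi_1$ and $\xi_2$ are taken to be independent normals with hypothetical standard deviations `sigma1` and `sigma2`, and all names and values are illustrative.

```python
import math
import random

ALTERNATIVES = ["bus", "train", "car", "ped", "bike"]
# Which error components (xi_1, xi_2) enter each utility; car belongs to both nests.
NESTS = {"bus": (1, 0), "train": (1, 0), "car": (1, 1), "ped": (0, 1), "bike": (0, 1)}

def simulate_cross_nested(V, sigma1, sigma2, R, rng):
    """Simulated choice probabilities, averaging over R draws of (xi_1, xi_2)."""
    totals = {alt: 0.0 for alt in ALTERNATIVES}
    for _ in range(R):
        # Assumption: xi_1 and xi_2 are independent zero-mean normals.
        xi1 = rng.gauss(0.0, sigma1)
        xi2 = rng.gauss(0.0, sigma2)
        u = {alt: V[alt] + NESTS[alt][0] * xi1 + NESTS[alt][1] * xi2
             for alt in ALTERNATIVES}
        u_max = max(u.values())
        exp_u = {alt: math.exp(x - u_max) for alt, x in u.items()}
        denom = sum(exp_u.values())
        for alt in ALTERNATIVES:
            totals[alt] += exp_u[alt] / denom
    return {alt: t / R for alt, t in totals.items()}

# Hypothetical systematic utilities (not from the slides).
V = {"bus": -0.5, "train": -0.3, "car": 0.0, "ped": -1.0, "bike": -0.8}
P_cross = simulate_cross_nested(V, sigma1=1.0, sigma2=1.5, R=2000, rng=random.Random(7))
print(P_cross["car"])
```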
Identification issue
• Not all parameters can be identified
• For logit, one ASC has to be constrained to zero
• Identification of NML is important and tricky
• See Walker, Ben-Akiva & Bolduc (2007) for a detailed analysis
Alternative specific variance
• Error terms in logit are i.i.d. and, in particular, have the same variance:
$$U_{in} = \beta^T x_{in} + ASC_i + \varepsilon_{in}$$
• $\varepsilon_{in}$ i.i.d. extreme value $\Rightarrow \operatorname{Var}(\varepsilon_{in}) = \pi^2 / (6\mu^2)$
• In order to allow for different variances, we use mixtures:
$$U_{in} = \beta^T x_{in} + ASC_i + \sigma_i \xi_i + \varepsilon_{in}$$
where $\xi_i \sim N(0, 1)$
• Variance:
$$\operatorname{Var}(\sigma_i \xi_i + \varepsilon_{in}) = \sigma_i^2 + \frac{\pi^2}{6\mu^2}$$
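The variance expression follows in one step if $\xi_i$ and $\varepsilon_{in}$ are assumed independent, an assumption the slide leaves implicit:

$$\operatorname{Var}(\sigma_i \xi_i + \varepsilon_{in}) = \sigma_i^2 \operatorname{Var}(\xi_i) + \operatorname{Var}(\varepsilon_{in}) + 2 \sigma_i \operatorname{Cov}(\xi_i, \varepsilon_{in}) = \sigma_i^2 \cdot 1 + \frac{\pi^2}{6\mu^2} + 0$$

For instance, with the hypothetical values $\mu = 1$ and $\sigma_i = 2$, the variance is $4 + \pi^2/6 \approx 5.645$, compared with $\pi^2/6 \approx 1.645$ for an alternative whose $\sigma$ is constrained to zero.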
Alternative specific variance
Identification issue:
• Not all σ's are identified
• One of them must be constrained to zero
• Not necessarily the one associated with the ASC constrained to zero
• In theory, the smallest σ must be constrained to zero
• In practice, we don't know a priori which one it is
• Solution:
  1. Estimate a model with a full set of σ's
  2. Identify the smallest one and constrain it to zero
Alternative specific variance
Example with Swissmetro

             ASC_CAR  ASC_SBB  ASC_SM  B_COST  B_FR   B_TIME
Car          1        0        0       cost    0      time
Train        0        0        0       cost    freq.  time
Swissmetro   0        0        1       cost    freq.  time

+ alternative specific variance