1. Mixtures of models
Michel Bierlaire, michel.bierlaire@epfl.ch
Transport and Mobility Laboratory

2. Mixtures
In statistics, a mixture density is a pdf that is a convex linear combination of other pdfs. If $f(\varepsilon, \theta)$ is a pdf, and if $w(\theta)$ is a nonnegative function such that
$\int_\theta w(\theta) \, d\theta = 1,$
then
$g(\varepsilon) = \int_\theta w(\theta) f(\varepsilon, \theta) \, d\theta$
is also a pdf. We say that $g$ is a mixture of $f$.
• If $f$ is the pdf of a logit model, $g$ is a mixture of logit.
• If $f$ is the pdf of an MEV model, $g$ is a mixture of MEV.

3. Mixtures
Discrete mixtures are also possible. If $f(\varepsilon, \theta)$ is a pdf, and if $w_i$, $i = 1, \ldots, n$, are nonnegative weights such that
$\sum_{i=1}^{n} w_i = 1,$
associated with parameter values $\theta_i$, $i = 1, \ldots, n$, then
$g(\varepsilon) = \sum_{i=1}^{n} w_i f(\varepsilon, \theta_i)$
is also a pdf. We say that $g$ is a discrete mixture of $f$.
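To make the definition concrete, here is a minimal NumPy sketch of a discrete mixture of two normal kernels; the weights, means, and grid are illustrative choices, not taken from the slides.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Univariate normal density, used here as the kernel f."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# g(x) = sum_i w_i f(x, theta_i), with nonnegative weights summing to 1.
weights = [0.3, 0.7]    # w_i (illustrative)
means = [-1.0, 2.0]     # theta_i: here, the means of the kernel (illustrative)
x = np.linspace(-6.0, 7.0, 1301)
g = sum(w * normal_pdf(x, m, 1.0) for w, m in zip(weights, means))

# The mixture is itself a pdf: nonnegative, and it integrates to ~1.
print(g.min() >= 0.0, g.sum() * (x[1] - x[0]))
```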

4. Mixtures
Two important motivations:
• Define more complex error terms:
  • heteroscedasticity
  • correlation across alternatives
• Capture taste heterogeneity

5. Capturing correlations
Logit: $U_{in} = V_{in} + \varepsilon_{in}$, where the $\varepsilon_{in}$ are i.i.d. EV.
Idea for the derivation of the nested logit model:
$U_{in} = V_{in} + \varepsilon_m + \varepsilon_{in},$
where $\varepsilon_m$ is the error term specific to nest $m$. Assumptions for the nested logit model:
• the $\varepsilon_m$ are independent across $m$,
• $\varepsilon_m + \varepsilon'_m \sim \mathrm{EV}(0, \mu)$, and
• $\varepsilon'_m = \max_{i \in C_m} (V_i + \varepsilon_{im}) - \frac{1}{\mu_m} \ln \sum_{i \in C_m} e^{\mu_m V_i}.$

6. Capturing correlations
• The assumptions are convenient for the derivation of the model.
• They are not natural or intuitive.
Consider a trinomial model where alternatives 1 and 2 are correlated:
$U_{1n} = V_{1n} + \varepsilon_m + \varepsilon_{1n}$
$U_{2n} = V_{2n} + \varepsilon_m + \varepsilon_{2n}$
$U_{3n} = V_{3n} + \varepsilon_{3n}$
If the $\varepsilon_{in}$ are i.i.d. EV and $\varepsilon_m$ is given, we have
$P_n(1 \mid \varepsilon_m, C_n) = \frac{e^{V_{1n} + \varepsilon_m}}{e^{V_{1n} + \varepsilon_m} + e^{V_{2n} + \varepsilon_m} + e^{V_{3n}}}.$

7. Capturing correlations
But... $\varepsilon_m$ is not given! If we know its density function, we have
$P_n(1 \mid C_n) = \int_{\varepsilon_m} P_n(1 \mid \varepsilon_m, C_n) f(\varepsilon_m) \, d\varepsilon_m.$
This is a mixture of logit models. In general, it is hopeless to obtain an analytical form for $P_n(1 \mid C_n)$, so simulation must be used.

8. Simulation: reminders
Pseudo-random number generators: although deterministically generated, the numbers exhibit the properties of random draws.
• Uniform distribution
• Standard normal distribution
• Transformations of the standard normal
• Inverse CDF
• Multivariate normal

9. Simulation: uniform distribution
• Almost all programming languages provide generators for the uniform $U(0,1)$.
• If $r$ is a draw from $U(0,1)$, then $s = (b - a) r + a$ is a draw from $U(a, b)$.
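As a minimal illustration in NumPy (the seed and the bounds $a$, $b$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # U(0,1) generator

a, b = 2.0, 5.0           # target interval (illustrative)
r = rng.uniform(size=5)   # draws from U(0,1)
s = (b - a) * r + a       # rescaled draws from U(a,b)
print(s)
```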

10. Simulation: standard normal
• If $r_1$ and $r_2$ are independent draws from $U(0,1)$, then
$s_1 = \sqrt{-2 \ln r_1} \, \sin(2 \pi r_2)$
$s_2 = \sqrt{-2 \ln r_1} \, \cos(2 \pi r_2)$
are independent draws from $N(0,1)$ (the Box-Muller transform).
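A minimal sketch of the Box-Muller transform, checking the sample moments of the resulting draws (sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Box-Muller: two independent U(0,1) draws yield two independent N(0,1) draws.
r1 = rng.uniform(size=100_000)
r2 = rng.uniform(size=100_000)
s1 = np.sqrt(-2.0 * np.log(r1)) * np.sin(2.0 * np.pi * r2)
s2 = np.sqrt(-2.0 * np.log(r1)) * np.cos(2.0 * np.pi * r2)

# Sample means should be close to 0 and sample standard deviations close to 1.
print(s1.mean(), s1.std(), s2.mean(), s2.std())
```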

11. Simulation: transformations of the standard normal
• If $r$ is a draw from $N(0,1)$, then $s = br + a$ is a draw from $N(a, b^2)$.
• If $r$ is a draw from $N(a, b^2)$, then $e^r$ is a draw from the lognormal $LN(a, b^2)$, with mean $e^{a + b^2/2}$ and variance $e^{2a + b^2}(e^{b^2} - 1)$.
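A quick numerical check of the lognormal formulas, with illustrative values of $a$ and $b$:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
a, b = 0.5, 0.8                      # illustrative parameters

r = rng.standard_normal(1_000_000)   # N(0,1)
s = b * r + a                        # N(a, b^2)
z = np.exp(s)                        # LN(a, b^2)

# Sample mean and variance vs. the analytical expressions.
print(z.mean(), np.exp(a + b**2 / 2.0))
print(z.var(), np.exp(2.0 * a + b**2) * (np.exp(b**2) - 1.0))
```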

12. Simulation: inverse CDF
• Consider a univariate r.v. with CDF $F(\varepsilon)$.
• If $F$ is invertible and if $r$ is a draw from $U(0,1)$, then $s = F^{-1}(r)$ is a draw from the given r.v.
• Example: EV with $F(\varepsilon) = e^{-e^{-\varepsilon}}$, so that $F^{-1}(r) = -\ln(-\ln r)$.
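A sketch of the inverse-CDF method for the EV example; the sample size is arbitrary, and the mean of a standard EV variate is Euler's constant, about 0.5772:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# EV (Gumbel) draws by inverting F(e) = exp(-exp(-e)): F^{-1}(r) = -ln(-ln r).
r = rng.uniform(size=1_000_000)
s = -np.log(-np.log(r))

print(s.mean())   # should be close to 0.5772 (Euler's constant)
```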

13. Simulation: multivariate normal
• If $r_1, \ldots, r_n$ are independent draws from $N(0,1)$, and $r = (r_1, \ldots, r_n)^T$,
• then $s = a + Lr$ is a vector of draws from the $n$-variate normal $N(a, LL^T)$, where
  • $L$ is lower triangular, and
  • $LL^T$ is the Cholesky factorization of the variance-covariance matrix.

14. Simulation: multivariate normal
Example:
$L = \begin{pmatrix} \ell_{11} & 0 & 0 \\ \ell_{21} & \ell_{22} & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} \end{pmatrix}$
$s_1 = \ell_{11} r_1$
$s_2 = \ell_{21} r_1 + \ell_{22} r_2$
$s_3 = \ell_{31} r_1 + \ell_{32} r_2 + \ell_{33} r_3$
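A minimal sketch with an illustrative mean vector and variance-covariance matrix; np.linalg.cholesky returns the lower-triangular factor $L$:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

a = np.array([1.0, -1.0, 0.5])            # illustrative mean vector
sigma = np.array([[2.0, 0.6, 0.3],        # illustrative positive definite
                  [0.6, 1.5, 0.4],        # variance-covariance matrix
                  [0.3, 0.4, 1.0]])

L = np.linalg.cholesky(sigma)             # lower triangular, sigma = L L^T

r = rng.standard_normal((3, 100_000))     # independent N(0,1) draws
s = a[:, None] + L @ r                    # draws from N(a, sigma)

print(np.cov(s))                          # should be close to sigma
```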

15. Simulation for mixtures of logit
• In order to approximate
$P_n(1 \mid C_n) = \int_{\varepsilon_m} P_n(1 \mid \varepsilon_m, C_n) f(\varepsilon_m) \, d\varepsilon_m,$
• draw from $f(\varepsilon_m)$ to obtain $r_1, \ldots, r_R$, and
• compute
$P_n(1 \mid C_n) \approx \tilde{P}_n(1 \mid C_n) = \frac{1}{R} \sum_{k=1}^{R} P_n(1 \mid r_k, C_n) = \frac{1}{R} \sum_{k=1}^{R} \frac{e^{V_{1n} + r_k}}{e^{V_{1n} + r_k} + e^{V_{2n} + r_k} + e^{V_{3n}}}.$
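A sketch of this Monte Carlo approximation for the trinomial example, assuming (as an illustration) that $\varepsilon_m$ is normal with standard deviation sigma_m; the utilities, scale, and number of draws are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

V1, V2, V3 = 0.8, 0.5, 0.2    # illustrative systematic utilities
sigma_m = 1.2                 # assumed scale of the nest error term
R = 10_000                    # number of draws

draws = sigma_m * rng.standard_normal(R)   # r_1, ..., r_R from f(eps_m)

# Conditional logit probability of alternative 1 for each draw r_k.
p_cond = np.exp(V1 + draws) / (np.exp(V1 + draws) + np.exp(V2 + draws) + np.exp(V3))

# The average over draws approximates the mixture probability.
print(p_cond.mean())
```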

16. Maximum simulated likelihood
$\max_\theta \; \mathcal{L}(\theta) = \sum_{n=1}^{N} \sum_{j=1}^{J} y_{jn} \ln \tilde{P}_n(j \mid \theta, C_n),$
where $y_{jn} = 1$ if individual $n$ has chosen alternative $j$, and 0 otherwise. The vector of parameters $\theta$ contains:
• the usual (fixed) parameters of the choice model,
• the parameters of the density of the random parameters.
For instance, if $\beta_j \sim N(\mu_j, \sigma_j^2)$, then $\mu_j$ and $\sigma_j$ are parameters to be estimated.
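A minimal sketch of the simulated log likelihood for a logit model whose last coefficient is random, $\beta_K \sim N(\mu, s^2)$. All names and the data layout are illustrative assumptions, not Biogeme code; the returned value would be maximized over $\theta$ with a numerical optimizer.

```python
import numpy as np

def simulated_loglik(theta, X, y, R=500, seed=0):
    """theta packs the K-1 fixed coefficients followed by (mu, s);
    X has shape (N, J, K); y[n] is the index of the chosen alternative."""
    rng = np.random.default_rng(seed)
    fixed = np.asarray(theta[:-2])
    mu, s = theta[-2], theta[-1]
    draws = mu + s * rng.standard_normal(R)            # R draws of beta_K
    loglik = 0.0
    for n in range(X.shape[0]):
        V = X[n, :, :-1] @ fixed                       # fixed part, shape (J,)
        U = V[:, None] + np.outer(X[n, :, -1], draws)  # utilities, shape (J, R)
        P = np.exp(U)
        P /= P.sum(axis=0)                             # logit kernel per draw
        loglik += np.log(P[y[n], :].mean())            # ln of simulated P~_n
    return loglik
```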

17. Maximum simulated likelihood
Warning: $\tilde{P}_n(j \mid \theta, C_n)$ is an unbiased estimator of $P_n(j \mid \theta, C_n)$:
$E[\tilde{P}_n(j \mid \theta, C_n)] = P_n(j \mid \theta, C_n),$
but $\ln \tilde{P}_n(j \mid \theta, C_n)$ is not an unbiased estimator of $\ln P_n(j \mid \theta, C_n)$:
$\ln E[\tilde{P}_n(j \mid \theta, C_n)] \neq E[\ln \tilde{P}_n(j \mid \theta, C_n)].$
Indeed, since $\ln$ is concave, Jensen's inequality gives $E[\ln \tilde{P}_n] \leq \ln E[\tilde{P}_n]$, so the simulated log likelihood is biased downward for finite $R$.

18. Maximum simulated likelihood
Properties of MSL:
• If $R$ is fixed, MSL is inconsistent.
• If $R$ rises at any rate with $N$, MSL is consistent.
• If $R$ rises faster than $\sqrt{N}$, MSL is asymptotically equivalent to ML.

19. Modeling
$P_n(1 \mid C_n) = \int_{\varepsilon_m} P_n(1 \mid \varepsilon_m, C_n) f(\varepsilon_m) \, d\varepsilon_m$
Depending on the role played by $\varepsilon_m$ in the kernel model, mixtures of logit can be used to model:
• heteroscedasticity,
• nesting structures,
• taste variations,
• and many more...

20. Heteroscedasticity
• Error terms in logit are i.i.d. and, in particular, homoscedastic:
$U_{in} = \beta^T x_{in} + \mathrm{ASC}_i + \varepsilon_{in}$
• In order to introduce heteroscedasticity in the model, we use random ASCs,
$\mathrm{ASC}_i \sim N(\overline{\mathrm{ASC}}_i, \sigma_i^2),$
so that
$U_{in} = \beta^T x_{in} + \overline{\mathrm{ASC}}_i + \sigma_i \xi_i + \varepsilon_{in},$
where $\xi_i \sim N(0,1)$ and $\overline{\mathrm{ASC}}_i$ denotes the mean of the random constant.
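A simulation sketch of the heteroscedastic specification for three alternatives; the utilities and $\sigma$ values are illustrative, and one $\sigma$ is set to zero in line with the identification discussion on the next slide:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
R = 10_000

V = np.array([0.4, 0.1, 0.0])        # beta' x + mean ASC (illustrative)
sigma = np.array([1.5, 0.8, 0.0])    # alternative-specific scales, one at zero

xi = rng.standard_normal((R, 3))     # xi_i ~ N(0,1), drawn R times
U = V + sigma * xi                   # random part of the utilities, (R, 3)

expU = np.exp(U)
P = (expU / expU.sum(axis=1, keepdims=True)).mean(axis=0)
print(P, P.sum())                    # mixture choice probabilities, sum to 1
```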

21. Heteroscedasticity
Identification issue:
• Not all $\sigma$'s are identified.
• One of them must be constrained to zero.
• It is not necessarily the one associated with the ASC that is constrained to zero.
• In theory, the smallest $\sigma$ must be constrained to zero.
• In practice, we don't know a priori which one it is.
• Solution:
  1. Estimate a model with a full set of $\sigma$'s.
  2. Identify the smallest one and constrain it to zero.

22. Heteroscedastic model
Example with Swissmetro:

              ASC_CAR  ASC_SBB  ASC_SM  B_COST  B_FR   B_TIME
  Car             1        0       0     cost     0     time
  Train           0        0       0     cost   freq.   time
  Swissmetro      0        0       1     cost   freq.   time

Heteroscedastic model: the ASCs are random.

23. Heteroscedastic model: estimation results
(blank entries correspond to parameters absent from, or constrained to zero in, the corresponding model)

                     Logit             Hetero           Hetero norm.
  L                -5315.39          -5241.01           -5242.10
                   Value   Scaled    Value   Scaled     Value   Scaled
  ASC_CAR_SP       0.189   1.000     0.248   1.000      0.241   1.000
  ASC_SM_SP        0.451   2.384     0.903   3.637      0.882   3.657
  B_COST          -0.011  -0.057    -0.018  -0.072     -0.018  -0.073
  B_FR            -0.005  -0.028    -0.008  -0.031     -0.008  -0.032
  B_TIME          -0.013  -0.067    -0.017  -0.069     -0.017  -0.071
  SIGMA_CAR_SP                       0.020
  SIGMA_SBB_SP                      -0.039              -0.061
  SIGMA_SM_SP                       -3.224              -3.180

24. Nesting structure
• The structure of the nested logit model can be mimicked with error components.
• For each nest $m$, define a random term $\sigma_m \xi_m$, where $\sigma_m \in \mathbb{R}$ and $\xi_m \sim N(0,1)$.
• $\sigma_m$ is the standard deviation of the error component $\sigma_m \xi_m$.
• If alternative $i$ belongs to nest $m$, its utility writes
$U_{in} = V_{in} + \sigma_m \xi_m + \varepsilon_{in},$
where $\varepsilon_{in}$ is, as usual, i.i.d. EV.
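A sketch showing how a shared error component induces nest-like correlation in the trinomial example of slide 6; the utilities and $\sigma_m$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
R = 10_000

V = np.array([0.6, 0.3, 0.0])   # illustrative systematic utilities
sigma_m = 1.0                   # assumed scale of the nest error component

xi = rng.standard_normal(R)     # one draw of xi_m per repetition
U = np.tile(V, (R, 1))
U[:, 0] += sigma_m * xi         # alternatives 1 and 2 (indices 0 and 1)
U[:, 1] += sigma_m * xi         # share the same error component

expU = np.exp(U)
P = (expU / expU.sum(axis=1, keepdims=True)).mean(axis=0)
print(P)                        # mixture choice probabilities
```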

25. Nesting structure
Example: residential telephone (BM and SM form the measured nest; LF, EF, and MF form the flat nest):

        ASC_BM  ASC_SM  ASC_LF  ASC_EF  BETA_C          σ_M    σ_F
  BM      1       0       0       0     ln(cost(BM))    ξ_M     0
  SM      0       1       0       0     ln(cost(SM))    ξ_M     0
  LF      0       0       1       0     ln(cost(LF))     0     ξ_F
  EF      0       0       0       1     ln(cost(EF))     0     ξ_F
  MF      0       0       0       0     ln(cost(MF))     0     ξ_F

26. Nesting structure
Identification issues:
• If there are two nests, only one $\sigma$ is identified.
• If there are more than two nests, all $\sigma$'s are identified.
See Walker (2001). Results with 5000 draws follow.

27. Nesting structure: estimation results (5000 draws)

                    NL               MLogit          MLogit, σ_F = 0   MLogit, σ_M = 0   MLogit, σ_F = σ_M
  L              -473.219          -472.768          -473.146          -472.779          -472.846
                Value   Scaled    Value     Scaled   Value     Scaled  Value     Scaled  Value     Scaled
  ASC_BM       -1.784   1.000    -3.81247   1.000   -3.79131   1.000  -3.80999   1.000  -3.81327   1.000
  ASC_EF       -0.558   0.313    -1.19899   0.314   -1.18549   0.313  -1.19711   0.314  -1.19672   0.314
  ASC_LF       -0.512   0.287    -1.09535   0.287   -1.08704   0.287  -1.0942    0.287  -1.0948    0.287
  ASC_SM       -1.405   0.788    -3.01659   0.791   -2.9963    0.790  -3.01426   0.791  -3.0171    0.791
  B_LOGCOST    -1.490   0.835    -3.25782   0.855   -3.24268   0.855  -3.2558    0.855  -3.25805   0.854
  FLAT          2.292
  MEAS          2.063
  σ_F                            -3.02027             0               -3.06144          -2.17138
  σ_M                            -0.52875             3.024833         0                -2.17138
  σ_F² + σ_M²                     9.402               9.150            9.372             9.430

Note that $\sigma_F^2 + \sigma_M^2$ is nearly constant across the mixture specifications, consistent with the identification result on the previous slide: with two nests, only one $\sigma$ is identified.
