CCP Estimation of Dynamic Discrete Choice Models With Unobserved Heterogeneity

Yitian (Sky) LIANG
Department of Marketing, Sauder School of Business

March 7, 2013
Roadmap

◮ Summary of the paper (5 mins)
◮ Motivating example: bus engine replacement model (Rust, 1987) (10 mins)
◮ Estimator and algorithm (10 mins)
◮ Application results in the motivating example (5 mins)
Summary

◮ Motivation: unobserved heterogeneity (unobserved, correlated state variables)
  ◮ First-stage estimates of the CCPs are no longer consistent
  ◮ Conditional independence (CI) is violated
◮ Develop a modified EM algorithm to estimate the structural parameters and the distribution of the unobserved state variables
◮ Develop the concept of "finite dependence" (will not be covered)
  ◮ Identification?
  ◮ Does it facilitate estimation?
Motivating Example (Setup): Our Friend - Harold Zurcher

◮ Infinite horizon (later, in the application, the horizon is set to be finite)
◮ Choice space $\{d_{1t}, d_{2t}\}$, i.e. replace the engine vs. keep it
◮ State space $\{x_t, s, \epsilon_t\}$: accumulated mileage since the last replacement, brand of the bus, and transitory shocks (not observed by the econometrician)
◮ Controlled transition rule:
  ◮ $x_{t+1} = x_t + 1$ if $d_{2t} = 1$ (keep)
  ◮ $x_{t+1} = 0$ if $d_{1t} = 1$ (replace)
◮ Per-period payoff: $u(d_{1t}, x_t, s) = d_{1t}\,\epsilon_{1t} + (1 - d_{1t})\,(\theta_0 + \theta_1 x_t + \theta_2 s + \epsilon_{2t})$
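To fix ideas, a minimal Python sketch of the transition rule and per-period payoff above; the function names are illustrative, not from the paper:

```python
def transition(x, d1):
    """Mileage resets to 0 on replacement (d1 = 1), else accumulates by 1."""
    return 0 if d1 == 1 else x + 1

def flow_payoff(d1, x, s, theta, eps):
    """u(d1, x, s): replacing yields eps[0]; keeping yields the
    mileage/brand flow cost plus eps[1]."""
    theta0, theta1, theta2 = theta
    return eps[0] if d1 == 1 else theta0 + theta1 * x + theta2 * s + eps[1]
```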
Harold Zurcher Cont.

◮ Hotz and Miller (1993): the difference between the conditional value functions can be represented by the flow payoff and the CCPs, i.e.
$$v_2(x, s) - v_1(x, s) = \theta_0 + \theta_1 x + \theta_2 s + \beta \log[p_1(0, s)] - \beta \log[p_1(x + 1, s)].$$
◮ Then we have
$$p_1(x, s) = \frac{1}{1 + \exp[v_2(x, s) - v_1(x, s)]}.$$
◮ Let $\pi_s$ be the probability that a bus is of brand $s$.
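A sketch of how this representation maps a first-stage CCP estimate into the model-implied replacement probability; the function name and mileage-grid layout are assumptions for illustration:

```python
import numpy as np

def implied_p1(p1_hat, theta, beta, s):
    """Map a first-stage CCP estimate p1_hat[x] (indexed by mileage x = 0..X,
    for a given brand s) into the model-implied replacement probability
    via the Hotz-Miller representation above."""
    theta0, theta1, theta2 = theta
    X = len(p1_hat) - 1                      # highest mileage on the grid
    x = np.arange(X)                         # x + 1 must stay on the grid
    v_diff = (theta0 + theta1 * x + theta2 * s
              + beta * np.log(p1_hat[0]) - beta * np.log(p1_hat[x + 1]))
    return 1.0 / (1.0 + np.exp(v_diff))      # p1(x, s) for x = 0..X-1
```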
Harold Zurcher Cont. (Suppose We Know $\hat{p}$)

◮ MLE:
$$(\hat\theta, \hat\pi) = \operatorname*{argmax}_{\theta, \pi} \sum_n \log\Big[\sum_s \pi_s \prod_t l(d_{nt} \mid x_{nt}, s, \hat p_1, \theta)\Big].$$
◮ EM algorithm
  ◮ Expectation step:
$$\hat q_{ns} = \Pr\big(s_n = s \mid d_n, x_n; \hat\theta, \hat\pi, \hat p_1\big) = \frac{\hat\pi_s \prod_t l(d_{nt} \mid x_{nt}, s, \hat p_1, \hat\theta)}{\sum_{s'} \hat\pi_{s'} \prod_t l(d_{nt} \mid x_{nt}, s', \hat p_1, \hat\theta)},$$
$$\hat\pi_s = \frac{1}{N} \sum_{n=1}^N \hat q_{ns}.$$
  ◮ Maximization step:
$$\hat\theta = \operatorname*{argmax}_\theta \sum_n \sum_s \hat q_{ns} \sum_t \log l(d_{nt} \mid x_{nt}, s, \hat p_1, \theta).$$
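A compact sketch of one EM iteration under these formulas. `period_lik` is an assumed helper returning the $(N, T)$ array of per-period choice likelihoods $l(d_{nt} \mid x_{nt}, s, \hat p_1, \theta)$; all names are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def em_step(D, X, pi, theta, p1_hat, period_lik, n_types):
    """One EM iteration. D, X are (N, T) arrays of choices and mileage."""
    N, T = D.shape
    # E-step: posterior type probabilities q[n, s] proportional to pi_s * prod_t l(.)
    loglik = np.zeros((N, n_types))
    for s in range(n_types):
        loglik[:, s] = np.log(period_lik(D, X, s, p1_hat, theta)).sum(axis=1)
    # subtract the row max before exponentiating for numerical stability
    q = pi[None, :] * np.exp(loglik - loglik.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)
    pi_new = q.mean(axis=0)                  # update type shares

    # M-step: maximize the q-weighted log-likelihood over theta
    def neg_Q(th):
        return -sum(q[:, s] @ np.log(period_lik(D, X, s, p1_hat, th)).sum(axis=1)
                    for s in range(n_types))
    theta_new = minimize(neg_Q, np.asarray(theta, float), method="Nelder-Mead").x
    return q, pi_new, theta_new
```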
Harold Zurcher Cont. (Update $\hat{p}$)

◮ Two ways to update the CCPs: model-based vs. non-model-based
◮ Non-model-based update of the CCPs:
$$p_1(x, s) = \Pr\{d_{1nt} = 1 \mid s_n = s, x_{nt} = x\} = \frac{E[d_{1nt} q_{ns} \mid x_{nt} = x]}{E[q_{ns} \mid x_{nt} = x]}.$$
◮ Sample analogue:
$$\hat p_1(x, s) = \frac{\sum_n \sum_t d_{1nt} \hat q_{ns} I(x_{nt} = x)}{\sum_n \sum_t \hat q_{ns} I(x_{nt} = x)}.$$
◮ Model-based update:
$$p_1^{(m+1)}(x_{nt}, s) = l_1\big(d_{nt} \mid x_{nt}, s, p_1^{(m)}, \theta^{(m)}\big).$$
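A sketch of the non-model-based (weighted-frequency) update; the array layout is an assumption for illustration:

```python
import numpy as np

def ccp_update_nonmodel(D, X_obs, q, n_x):
    """Weighted-frequency CCP update: for each mileage x and type s,
    p1_hat[x, s] = sum_{n,t} d1_nt * q_ns * 1{x_nt = x}
                 / sum_{n,t} q_ns * 1{x_nt = x}."""
    N, T = D.shape
    n_types = q.shape[1]
    num = np.zeros((n_x, n_types))
    den = np.zeros((n_x, n_types))
    for n in range(N):
        for t in range(T):
            x = X_obs[n, t]
            num[x, :] += D[n, t] * q[n, :]
            den[x, :] += q[n, :]
    return num / np.maximum(den, 1e-12)      # guard empty mileage cells
```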
General Model

◮ Larger choice space; non-stationarity (i.e. finite horizon)
◮ Unobserved heterogeneity changes over time: need to estimate its transition $\pi(s_{t+1} \mid s_t)$
◮ Initial value problem: need to estimate $\pi(s_1 \mid x_1)$
◮ Sketch of the algorithm
  ◮ Expectation step: sequentially update $q_{nst} \to \pi(s_1 \mid x_1),\, \pi(s_{t+1} \mid s_t) \to p_{jt}(x, s)$
  ◮ Maximization step: maximize the conditional likelihood w.r.t. the structural parameters
General Model - Likelihood

$$L(d_n, x_n \mid x_{n1}; \theta, \pi, p) = \sum_{s_1} \sum_{s_2} \cdots \sum_{s_T} \Big[\pi(s_1 \mid x_{n1})\, L_1(d_{n1}, x_{n2} \mid x_{n1}, s_1; \theta, \pi, p) \times \prod_{t=2}^T \pi(s_t \mid s_{t-1})\, L_t(d_{nt}, x_{n,t+1} \mid x_{nt}, s_t; \theta, \pi, p)\Big],$$
where
$$L_t(d_{nt}, x_{n,t+1} \mid x_{nt}, s_t; \theta, \pi, p) = \prod_{j=1}^J \big[l_{jt}(x_{nt}, s_{nt}, \theta, \pi, p)\, f_{jt}(x_{n,t+1} \mid x_{nt}, s_{nt}, \theta)\big]^{d_{jnt}}.$$
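Evaluating this sum path by path costs $S^T$ operations per bus, but it factorizes into a forward recursion over $s_t$. A sketch under assumed array shapes (not code from the paper):

```python
import numpy as np

def path_likelihood(Lt, pi_init, Pi):
    """Evaluate L(d_n, x_n | x_n1) by forward recursion over the type s_t.
    Lt: (N, T, S) per-period likelihoods L_t(... | x_nt, s_t);
    pi_init: (N, S) initial type probs pi(s_1 | x_n1);
    Pi: (S, S) transition matrix with Pi[s, s'] = pi(s' | s)."""
    N, T, S = Lt.shape
    alpha = pi_init * Lt[:, 0, :]            # (N, S): pi(s1|x1) * L_1(s1)
    for t in range(1, T):
        alpha = (alpha @ Pi) * Lt[:, t, :]   # sum over s_{t-1}, then multiply L_t
    return alpha.sum(axis=1)                 # (N,) likelihood per bus
```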
The Algorithm - Expectation Step

Update $q_{nst}^{(m)}$:
$$q_{nst}^{(m+1)} = \frac{L_n^{(m)}(s_{nt} = s)}{L_n^{(m)}},$$
where
$$L_n(s_{nt} = s) = \sum_{s_1} \cdots \sum_{s_{t-1}} \sum_{s_{t+1}} \cdots \sum_{s_T} \pi(s_1 \mid x_{n1})\, L_{n1}(s_1) \Big[\prod_{t'=2}^{t-1} \pi(s_{t'} \mid s_{t'-1})\, L_{nt'}(s_{t'})\Big] \times \pi(s_t \mid s_{t-1})\, L_{nt}(s)\, \pi(s_{t+1} \mid s)\, L_{n,t+1}(s_{t+1}) \Big[\prod_{t'=t+2}^{T} \pi(s_{t'} \mid s_{t'-1})\, L_{nt'}(s_{t'})\Big].$$
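The same factorization gives $q_{nst}$ via a forward-backward pass; a sketch reusing the array shapes from the likelihood block above:

```python
import numpy as np

def posterior_q(Lt, pi_init, Pi):
    """Compute q[n, t, s] = Pr(s_nt = s | d_n, x_n) by forward-backward.
    Shapes as in path_likelihood: Lt (N, T, S), pi_init (N, S), Pi (S, S)."""
    N, T, S = Lt.shape
    alpha = np.empty((N, T, S))              # data up to t, ending in state s
    beta = np.empty((N, T, S))               # data after t, given state s at t
    alpha[:, 0] = pi_init * Lt[:, 0]
    for t in range(1, T):
        alpha[:, t] = (alpha[:, t - 1] @ Pi) * Lt[:, t]
    beta[:, T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[:, t] = (beta[:, t + 1] * Lt[:, t + 1]) @ Pi.T
    q = alpha * beta                         # joint: L_n(s_nt = s)
    q /= q.sum(axis=2, keepdims=True)        # divide by L_n
    return q                                 # (N, T, S)
```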
The Algorithm - Expectation Step Cont.

Update $\pi^{(m)}(s \mid x)$:
$$\pi^{(m+1)}(s \mid x) = \frac{\sum_{n=1}^N q_{ns1}^{(m+1)} I(x_{n1} = x)}{\sum_{n=1}^N I(x_{n1} = x)}.$$
Update $\pi^{(m+1)}(s' \mid s)$:
$$\pi^{(m+1)}(s' \mid s) = \frac{\sum_{n=1}^N \sum_{t=2}^T q_{ns't|s}^{(m+1)} q_{ns,t-1}^{(m+1)}}{\sum_{n=1}^N \sum_{t=2}^T q_{ns,t-1}^{(m+1)}},$$
where the definition of $q_{ns't|s}^{(m+1)}$ is on page 1847.
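A sketch of both updates. The conditional posterior $q_{ns't|s}$ is taken as given (as the slide notes, its definition is on page 1847 of the paper); the array `q_cond` and its layout are assumptions:

```python
import numpy as np

def update_pi_init(q, x1, n_x):
    """pi(s | x): average of q[n, 0, s] over buses with initial mileage x_n1 = x."""
    N, T, S = q.shape
    pi_init = np.zeros((n_x, S))
    for x in range(n_x):
        mask = (x1 == x)
        if mask.any():
            pi_init[x] = q[mask, 0, :].mean(axis=0)
    return pi_init

def update_pi_trans(q, q_cond):
    """pi(s' | s) = sum_{n, t>=2} q_{ns't|s} q_{ns,t-1} / sum_{n, t>=2} q_{ns,t-1}.
    q_cond[n, sp, t, s] stands in for the paper's q_{ns't|s} (p. 1847)."""
    N, T, S = q.shape
    num = np.zeros((S, S))
    den = np.zeros(S)
    for t in range(1, T):                    # the paper's t = 2, ..., T
        for s in range(S):
            num[s] += (q_cond[:, :, t, s] * q[:, t - 1, s][:, None]).sum(axis=0)
            den[s] += q[:, t - 1, s].sum()
    return num / den[:, None]
```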
The Algorithm - Expectation Step Cont. & Maximization Step

Update $p_{jt}^{(m+1)}(x, s)$:
$$p_{jt}^{(m+1)}(x, s) = \frac{\sum_{n=1}^N d_{njt} q_{nst}^{(m+1)} I(x_{nt} = x)}{\sum_{n=1}^N q_{nst}^{(m+1)} I(x_{nt} = x)}.$$
Maximization step:
$$\theta^{(m+1)} = \operatorname*{argmax}_\theta \sum_n \sum_t \sum_s q_{nst}^{(m+1)} \log L_t\big(d_{nt}, x_{n,t+1} \mid x_{nt}, s_{nt} = s; \theta, \pi^{(m+1)}, p^{(m+1)}\big).$$
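A sketch of the CCP update in the general model; the M-step can reuse the `neg_Q` pattern from the earlier EM sketch with $q_{nst}$ weights. Array layouts are illustrative:

```python
import numpy as np

def ccp_update_general(D, X_obs, q, n_x):
    """p_jt(x, s): q-weighted frequency of choice j among observations with
    x_nt = x in period t. D is (N, T, J) one-hot choices; q is (N, T, S)."""
    N, T, J = D.shape
    S = q.shape[2]
    p = np.zeros((J, T, n_x, S))
    for t in range(T):
        for x in range(n_x):
            mask = (X_obs[:, t] == x)        # buses at mileage x in period t
            if mask.any():
                num = D[mask, t, :].T @ q[mask, t, :]   # (J, S): sum_n d_njt q_nst
                den = q[mask, t, :].sum(axis=0)         # (S,):  sum_n q_nst
                p[:, t, x, :] = num / np.maximum(den, 1e-12)
    return p
```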
Alternative Algorithm - Two Stage Estimator

◮ Stage 1: recover $\theta_1, \pi(s_1 \mid x_1), \pi(s' \mid s), p_{jt}(x_t, s_t)$ using the EM algorithm.
◮ Stage 2: recover $\theta_2$.
◮ Key idea: a non-parametric representation of the likelihood (free of the structural payoff parameters $\theta_2$):
$$L_t(d_{nt}, x_{n,t+1} \mid x_{nt}, s_{nt}; \theta_1, \pi, p) = \prod_{j=1}^J \big[l_{jt}(x_{nt}, s_{nt}, \theta, \pi, p)\, f_{jt}(x_{n,t+1} \mid x_{nt}, s_{nt}, \theta_1)\big]^{d_{jnt}} = \prod_{j=1}^J \big[p_{jt}(x_{nt}, s_{nt})\, f_{jt}(x_{n,t+1} \mid x_{nt}, s_{nt}, \theta_1)\big]^{d_{jnt}}.$$
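A sketch of the stage-1 period likelihood under this representation, with the nonparametric $p_{jt}$ standing in for $l_{jt}$; the signature and the transition-density callable `f` are assumptions:

```python
def stage1_period_lik(d_onehot, x_now, x_next, s, t, p, f):
    """L_t = prod_j [p_jt(x, s) * f_jt(x' | x, s; theta1)]^{d_jnt}: the payoff
    parameters theta_2 drop out once p_jt replaces l_jt; f still carries theta_1.
    p is the (J, T, n_x, S) CCP array; f(j, t, x_next, x_now, s) is the
    assumed transition density."""
    J = len(d_onehot)
    lik = 1.0
    for j in range(J):
        if d_onehot[j] == 1:                 # only the chosen j contributes
            lik *= p[j, t, x_now, s] * f(j, t, x_next, x_now, s)
    return lik
```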
Alternative Algorithm - Two Stage Estimator Cont.

◮ Stage 1 expectation step: update $q$ and $\pi$
◮ Stage 1 maximization step: maximize the conditional likelihood w.r.t. $p$ and $\theta_1$
◮ Stage 2: given the stage 1 estimates, apply any CCP-based method to recover $\theta_2$, e.g. Hotz and Miller (1993), BBL (2007)
Back to Harold Zurcher