Discrete Panel Data Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Discrete Panel Data – p. 1/34
Outline
• Introduction
• Static model
• Static model with panel effect
• Dynamic model
• Dynamic model with panel effect
• Application
Introduction
• Type of data used so far: cross-sectional.
• Cross-sectional: observation of individuals at the same point in time.
• Time series: sequence of observations.
• Panel data is a combination of comparable time series.
Introduction
• Panel data: data collected over multiple time periods for the same sample of individuals.
• Multidimensional:

Individual  Day  Price of stock 1  Price of stock 2  Purchase
$n$         $t$  $x_{1nt}$         $x_{2nt}$         $i_{nt}$
1           1    12.3              15.6              1
1           2    12.1              18.6              2
1           3    11.0              25.3              2
1           4    9.2               25.1              0
2           1    12.3              15.6              2
2           2    12.1              18.6              0
2           3    11.0              25.3              0
2           4    9.2               25.1              1
Introduction
Examples of discrete panel data:
• People are interviewed monthly and asked if they are working or unemployed.
• Firms are tracked yearly to determine if they have been acquired or merged.
• Consumers are interviewed yearly and asked if they have acquired a new cell phone.
• Individuals' health records are reviewed annually to determine the onset of new health problems.
Model: single time period
[Figure: diagram with the variables $x$, $\varepsilon$, $U$ and $i$ for a single period]
Static model
[Figure: diagram with the variables $x$, $\varepsilon$, $U$ and $i$ at periods $t-1$ and $t$]
Static model
The model:
• Utility: $U_{int} = V_{int} + \varepsilon_{int}$, $i \in \mathcal{C}_{nt}$.
• Logit:
$$P(i_{nt}) = \frac{e^{V_{int}}}{\sum_{j \in \mathcal{C}_{nt}} e^{V_{jnt}}}$$
• Estimation: contribution of individual $n$ to the log likelihood:
$$P(i_{n1}, i_{n2}, \ldots, i_{nT}) = P(i_{n1}) P(i_{n2}) \cdots P(i_{nT}) = \prod_{t=1}^{T} P(i_{nt})$$
$$\ln P(i_{n1}, i_{n2}, \ldots, i_{nT}) = \ln P(i_{n1}) + \ln P(i_{n2}) + \cdots + \ln P(i_{nT}) = \sum_{t=1}^{T} \ln P(i_{nt})$$
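The estimation step above can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the function names, the dictionary layout and the numerical utilities are assumptions made for the example, and the utilities $V_{int}$ are taken as given rather than computed from parameters.

```python
import math

def logit_probabilities(utilities):
    """Return logit choice probabilities for a dict {alternative: V}."""
    # Subtract the largest utility before exponentiating (numerical stability).
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    denom = sum(exp_v.values())
    return {alt: e / denom for alt, e in exp_v.items()}

def static_log_likelihood(utilities_by_period, choices):
    """Sum of ln P(i_nt) over the periods observed for one individual."""
    return sum(
        math.log(logit_probabilities(v_t)[i_t])
        for v_t, i_t in zip(utilities_by_period, choices)
    )

# Hypothetical systematic utilities for one individual over T = 3 periods.
utilities_by_period = [
    {0: 0.0, 1: 0.5, 2: -0.2},
    {0: 0.0, 1: 0.1, 2: 0.4},
    {0: 0.0, 1: -0.3, 2: 0.9},
]
choices = [1, 2, 2]
log_lik = static_log_likelihood(utilities_by_period, choices)
```

Each period is treated as one more independent observation, which is exactly why standard cross-sectional software can be reused for this model.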
Static model: comments
• Views observations collected through time as supplementary cross-sectional observations.
• Standard software for cross-sectional discrete choice modeling may be used directly.
• Simple, but there are two important limitations:
1. Serial correlation:
  • unobserved factors persist over time,
  • in particular, all factors related to individual $n$,
  • $\varepsilon_{in(t-1)}$ cannot be assumed independent from $\varepsilon_{int}$.
2. Dynamics:
  • Choice in one period may depend on choices made in the past,
  • e.g. learning effect, habits.
Dealing with serial correlation
[Figure: diagram with the variables $x$, $\varepsilon$, $U$ and $i$ at periods $t-1$ and $t$]
Panel effect
• Relax the assumption that $\varepsilon_{int}$ are independent across $t$.
• Assumption about the source of the correlation:
  • individual-related unobserved factors,
  • persistent over time.
• The model: $\varepsilon_{int} = \alpha_{in} + \varepsilon'_{int}$
• It is also known as
  • agent effect,
  • unobserved heterogeneity.
Panel effect
• Assuming that $\varepsilon'_{int}$ are independent across $t$, we can apply the static model.
• Two versions of the model:
  • with fixed effect: $\alpha_{in}$ are unknown parameters to be estimated,
  • with random effect: $\alpha_{in}$ are distributed.
Static model with fixed effect
The model:
• Utility: $U_{int} = V_{int} + \alpha_{in} + \varepsilon'_{int}$, $i \in \mathcal{C}_{nt}$.
• Logit:
$$P(i_{nt}) = \frac{e^{V_{int} + \alpha_{in}}}{\sum_{j \in \mathcal{C}_{nt}} e^{V_{jnt} + \alpha_{jn}}}$$
• Estimation: contribution of individual $n$ to the log likelihood:
$$P(i_{n1}, i_{n2}, \ldots, i_{nT}) = P(i_{n1}) P(i_{n2}) \cdots P(i_{nT}) = \prod_{t=1}^{T} P(i_{nt})$$
$$\ln P(i_{n1}, i_{n2}, \ldots, i_{nT}) = \ln P(i_{n1}) + \ln P(i_{n2}) + \cdots + \ln P(i_{nT}) = \sum_{t=1}^{T} \ln P(i_{nt})$$
Static model with fixed effect
Comments:
• $\alpha_{in}$ capture permanent taste heterogeneity.
• For each $n$, one $\alpha_{in}$ must be normalized to 0.
• The $\alpha$'s are estimated consistently only if $T \to \infty$.
• This inconsistency carries over to the other parameters, which are then also inconsistently estimated.
• In practice,
  • $T$ is usually too short,
  • the number of $\alpha$ parameters is usually too high,
  for the model to be consistently estimated and practical.
Static model with random effect
• Denote $\alpha_n$ the vector gathering all parameters $\alpha_{in}$.
• Assumption: $\alpha_n$ is distributed with density $f(\alpha_n)$.
• For instance: $\alpha_n \sim N(0, \Sigma)$.
• We have a mixture of static models.
• Given $\alpha_n$, the model is static, as $\varepsilon'_{int}$ are assumed independent across $t$.
Static model with random effect
The model:
• Utility: $U_{int} = V_{int} + \alpha_{in} + \varepsilon'_{int}$, $i \in \mathcal{C}_{nt}$.
• Conditional choice probability:
$$P(i_{nt} \mid \alpha_n) = \frac{e^{V_{int} + \alpha_{in}}}{\sum_{j \in \mathcal{C}_{nt}} e^{V_{jnt} + \alpha_{jn}}}$$
Static model with random effect
Estimation:
• Probability of the choices of individual $n$, given $\alpha_n$:
$$P(i_{n1}, i_{n2}, \ldots, i_{nT} \mid \alpha_n) = \prod_{t=1}^{T} P(i_{nt} \mid \alpha_n).$$
• Unconditional choice probability (its logarithm is the contribution of individual $n$ to the log likelihood):
$$P(i_{n1}, i_{n2}, \ldots, i_{nT}) = \int_{\alpha} \prod_{t=1}^{T} P(i_{nt} \mid \alpha) f(\alpha) \, d\alpha.$$
Static model with random effect
Estimation:
• Mixture model.
• Requires simulation for large choice sets.
• Generate draws $\alpha_1, \ldots, \alpha_R$ from $f(\alpha)$.
• Approximate
$$P(i_{n1}, i_{n2}, \ldots, i_{nT}) = \int_{\alpha} \prod_{t=1}^{T} P(i_{nt} \mid \alpha) f(\alpha) \, d\alpha \approx \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} P(i_{nt} \mid \alpha_r)$$
• The product of probabilities can generate very small numbers. It is therefore computed in logarithms:
$$\sum_{r=1}^{R} \prod_{t=1}^{T} P(i_{nt} \mid \alpha_r) = \sum_{r=1}^{R} \exp\left( \sum_{t=1}^{T} \ln P(i_{nt} \mid \alpha_r) \right).$$
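The simulation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the course software: the function names are hypothetical, $\Sigma$ is assumed diagonal (independent normal effects with standard deviations `sigmas`), and the effect of the first alternative is normalized to 0. On top of the formula above, the sketch also shifts by the largest log term before exponentiating (the usual log-sum-exp device), so that the sum over draws does not underflow either.

```python
import math
import random

def conditional_log_prob(v_t, alpha, chosen):
    """ln P(i_nt | alpha) for a logit with utilities V_jnt + alpha_j."""
    u = [v + a for v, a in zip(v_t, alpha)]
    u_max = max(u)
    log_denom = u_max + math.log(sum(math.exp(x - u_max) for x in u))
    return u[chosen] - log_denom

def simulated_log_prob(utilities, choices, sigmas, n_draws, rng):
    """ln of (1/R) * sum_r prod_t P(i_nt | alpha_r), computed in logs."""
    log_terms = []
    for _ in range(n_draws):
        # Assumption: diagonal Sigma; alternative 0 has its effect fixed to 0.
        alpha = [0.0] + [rng.gauss(0.0, s) for s in sigmas]
        # Sum of logs over t replaces the product of probabilities.
        log_terms.append(sum(
            conditional_log_prob(v_t, alpha, i_t)
            for v_t, i_t in zip(utilities, choices)
        ))
    # Shift by the largest term so that exp() does not underflow.
    m = max(log_terms)
    total = sum(math.exp(x - m) for x in log_terms)
    return m + math.log(total) - math.log(n_draws)

# Hypothetical utilities (3 alternatives, T = 3) and observed choices.
utilities = [[0.0, 0.5, -0.2], [0.0, 0.1, 0.4], [0.0, -0.3, 0.9]]
choices = [1, 2, 2]
value = simulated_log_prob(utilities, choices, [1.0, 1.0], 500, random.Random(0))
```

Note that the same draw $\alpha_r$ is used for all periods of an individual: this is what induces the correlation across $t$.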
Static model with random effect
Comments:
• Parameters to be estimated: $\beta$'s and $\sigma$'s.
• Maximum likelihood estimation leads to consistent and efficient estimators.
• Ignoring the correlation (i.e. assuming that $\alpha_n$ is not present) leads to consistent but not efficient estimators (not the true likelihood function).
• Accounting for serial correlation generates the true likelihood function and, therefore, the estimates are consistent and efficient.
Dynamics
• Choice in one period may depend on choices made in the past,
  • e.g. learning effect, habits.
• Simplifying assumption: the utility of an alternative at time $t$ is influenced by the choice made at time $t-1$ only.
• It leads to a dynamic Markov model.
Dynamic Markov model
[Figure: diagram with the variables $x$, $\varepsilon$, $U$ and $i$ at periods $t-1$ and $t$]
Dynamic Markov model
The model:
$$U_{int} = V_{int} + \gamma y_{in(t-1)} + \varepsilon_{int}, \quad i \in \mathcal{C}_{nt}.$$
$$y_{in(t-1)} = \begin{cases} 1 & \text{if alternative } i \text{ was chosen by } n \text{ at time } t-1, \\ 0 & \text{otherwise.} \end{cases}$$
• Captures serial dependence on the past realized state.
• Example: the utility of bus today depends on whether the consumer took the bus yesterday (habit).
• Fails if the utility of bus today depends both on a permanent individual taste for bus and on whether the consumer took the bus yesterday: the model assumes no serial correlation.
• Estimation: same as for the static model, except that observation $t = 0$ is lost.
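A short Python sketch of this estimation step, assuming the systematic utilities are given and that the initial choice $i_{n0}$ is stored as the first entry of the list of choices. The function name and the numerical values are hypothetical.

```python
import math

def markov_log_likelihood(utilities, choices, gamma):
    """Sum over t = 1..T of ln P(i_nt | i_n(t-1)); choices[0] is i_n0."""
    log_lik = 0.0
    # Period 0 only provides the lagged choice: it is "lost" as an observation.
    for t in range(1, len(choices)):
        previous = choices[t - 1]
        # y_in(t-1) = 1 only for the alternative chosen in the previous period.
        u = [v + (gamma if j == previous else 0.0)
             for j, v in enumerate(utilities[t])]
        u_max = max(u)
        log_denom = u_max + math.log(sum(math.exp(x - u_max) for x in u))
        log_lik += u[choices[t]] - log_denom
    return log_lik

# Hypothetical utilities for periods 0..3 (row 0 is not used) and choices.
utilities = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, -0.2],
    [0.0, 0.1, 0.4],
    [0.0, -0.3, 0.9],
]
choices = [1, 1, 1, 2]
log_lik = markov_log_likelihood(utilities, choices, gamma=0.8)
```

With `gamma = 0` the function reduces to the static log likelihood over periods 1 to $T$; a positive `gamma` raises the probability of repeating the previous choice.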
Dynamic Markov model with serial correlation
[Figure: diagram with the variables $x$, $\varepsilon$, $U$ and $i$ at periods $t-1$ and $t$]
Dynamic Markov model
• Extension: combine Markov with panel effect.
$$U_{int} = V_{int} + \alpha_{in} + \gamma y_{in(t-1)} + \varepsilon'_{int}, \quad i \in \mathcal{C}_{nt}.$$
• Dynamic Markov model with fixed effect (FE).
  • Similar to the static model with FE.
  • Similar limitations.
• Dynamic Markov model with random effect (RE).
  • Difficulties arise depending on how the Markov chain starts.
  • If the first choice $i_{n0}$ is truly exogenous → similar to the static model with RE.
Dynamic Markov model
What if $i_{n0}$ is not exogenous (i.e. stochastic)?
$$U_{in1} = V_{in1} + \alpha_{in} + \gamma y_{in0} + \varepsilon'_{in1}, \quad i \in \mathcal{C}_{n1}.$$
• The first choice $i_{n0}$ depends on the agent effect $\alpha_{in}$.
• So, the explanatory variable $y_{in0}$ is correlated with $\alpha_{in}$.
• This is called endogeneity.
• Solution: use the Wooldridge approach.
Dynamic Markov model with RE - Wooldridge
• Conditional on $y_{in0}$, we have a dynamic Markov model with RE as before.
$$U_{int} = V_{int} + \alpha_{in} + \gamma y_{in(t-1)} + \varepsilon'_{int}, \quad i \in \mathcal{C}_{nt}.$$
• Probability of the choices of individual $n$, given $i_{n0}$ and $\alpha_n$:
$$P(i_{n1}, i_{n2}, \ldots, i_{nT} \mid i_{n0}, \alpha_n) = \prod_{t=1}^{T} P(i_{nt} \mid i_{n0}, \alpha_n).$$
• We integrate out $\alpha_n$:
$$P(i_{n1}, i_{n2}, \ldots, i_{nT} \mid i_{n0}) = \int_{\alpha} \prod_{t=1}^{T} P(i_{nt} \mid i_{n0}, \alpha) f(\alpha \mid i_{n0}) \, d\alpha.$$
Dynamic Markov model with RE - Wooldridge
• The main difference between the static model with RE and the dynamic model with RE is the term $f(\alpha \mid i_{n0})$.
• It captures the distribution of the panel effects, knowing the first choice.
• This can be approximated by, for instance,
$$\alpha_n = a + b y_{n0} + c x_n + \xi_n, \quad \xi_n \sim N(0, \Sigma_\alpha).$$
• $a$, $b$ and $c$ are vectors and $\Sigma_\alpha$ a matrix of parameters to be estimated.
• $x_n$ captures the entire history ($t = 1, \ldots, T$) for agent $n$.
• This addresses the endogeneity issue.
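To make the approximation concrete, here is a sketch of how one draw of $\alpha_n$ conditional on the first choice could be generated for simulation. It relies on several simplifying assumptions that are not in the slides: the specification is applied alternative by alternative, the history $x_n$ is summarized by a single scalar `x_summary`, and $\Sigma_\alpha$ is diagonal. All names and values are hypothetical.

```python
import random

def draw_panel_effects(a, b, c, y0, x_summary, sigmas, rng):
    """One draw of alpha_n = a + b * y_n0 + c * x_n + xi_n, per alternative."""
    return [
        # xi_in is drawn as an independent normal (diagonal Sigma_alpha assumed).
        a_i + b_i * y_i + c_i * x_summary + rng.gauss(0.0, s_i)
        for a_i, b_i, c_i, y_i, s_i in zip(a, b, c, y0, sigmas)
    ]

# Hypothetical values for 3 alternatives; the individual first chose alternative 1.
a = [0.0, 0.1, -0.1]
b = [0.8, 0.8, 0.8]
c = [0.0, 0.5, 0.2]
y0 = [0, 1, 0]        # dummy vector y_n0 for the first choice
x_summary = 0.2       # scalar summary of the history x_n (assumption)
alpha_draw = draw_panel_effects(a, b, c, y0, x_summary, [1.0, 1.0, 1.0], random.Random(0))
```

Such draws would replace the draws from $f(\alpha)$ in the simulated probability of the static model with RE, with $a$, $b$, $c$ and the variances estimated jointly with the other parameters.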