Probabilistic Choice Models James J. Heckman University of Chicago Econ 312, Spring 2019 Heckman Probabilistic Choice Models
• This chapter examines models commonly used for probabilistic choice, such as the choice of one mode of transportation from among the many available to a consumer.
• Section 1 discusses the derivation and limitations of conditional logit models.
• Section 2 discusses probit models, and Section 3 discusses nested logit (generalized extreme value) models, which address some of the limitations of conditional logit models.
The Conditional Logit Model
• In this section we investigate conditional logit models.
• We discuss their derivation from a random utility model with Extreme Value Type I distributed shocks.
• The relevant properties of the Extreme Value Type I distribution are discussed.
• We also derive the conditional logit model from the Luce axioms.
• We discuss some of the limitations of conditional logit models.
The Extreme Value Type I Distribution
• Suppose $\varepsilon_1, \dots, \varepsilon_n$ are independent (not necessarily identically distributed) Extreme Value Type I random variables.
• Then the CDF of $\varepsilon_i$ is:
$$\Pr(\varepsilon_i < c) = F(c) = \exp\left(-\exp\left(-(c + \alpha_i)\right)\right)$$
where $\alpha_i$ is a parameter of the Extreme Value Type I CDF.
• Also, by the assumption of independence, we can write:
$$F(\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n) = \prod_{i=1}^{n} F(\varepsilon_i) = \prod_{i=1}^{n} \exp\left(-\exp\left(-(\varepsilon_i + \alpha_i)\right)\right)$$
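The CDF above can be inverted in closed form, which gives a simple way to simulate these shocks. The following is a minimal sketch, not part of the original notes; the function names `ev1_cdf`, `ev1_quantile`, and `ev1_draw` are hypothetical and chosen only for illustration.

```python
import math
import random


def ev1_cdf(c, alpha):
    """CDF from the slide: F(c) = exp(-exp(-(c + alpha)))."""
    return math.exp(-math.exp(-(c + alpha)))


def ev1_quantile(u, alpha):
    """Solve u = F(c) for c, valid for 0 < u < 1."""
    return -math.log(-math.log(u)) - alpha


def ev1_draw(alpha, rng):
    """One draw by inverse-CDF sampling (illustrative helper)."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0; avoid log(0)
        u = rng.random()
    return ev1_quantile(u, alpha)


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
sample = [ev1_draw(0.5, rng) for _ in range(5)]
```

Applying `ev1_cdf` to the output of `ev1_quantile` should return the original probability, which is a quick way to confirm that the inversion matches the parameterization used here.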
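• The Extreme Value Type I distribution has two useful features.
• First, the difference between two independent Extreme Value Type I random variables follows a logistic distribution, which is the source of the logit form.
• Second, Extreme Value Type I random variables are closed under maximization, since (assuming independence):
$$\begin{aligned}
\Pr\left(\max_i \{\varepsilon_i\} \le \varepsilon\right) &= \prod_{i=1}^{n} \Pr(\varepsilon_i \le \varepsilon) \\
&= \prod_{i=1}^{n} \exp\left(-\exp\left(-(\varepsilon + \alpha_i)\right)\right) \\
&= \exp\left(-\sum_{i=1}^{n} \exp\left(-(\varepsilon + \alpha_i)\right)\right) \\
&= \exp\left(-\exp(-\varepsilon) \sum_{i=1}^{n} \exp(-\alpha_i)\right) \qquad (1)
\end{aligned}$$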
• Consider $\sum_{i=1}^{n} \exp(-\alpha_i)$.
• We can solve for $\alpha$ in the following equation:
$$\sum_{i=1}^{n} \exp(-\alpha_i) = \exp(-\alpha)$$
which implies:
$$-\alpha = \log\left(\sum_{i=1}^{n} \exp(-\alpha_i)\right).$$
• We can then substitute this value of $\alpha$ into equation (1) to get:
$$\Pr\left(\max_i \{\varepsilon_i\} \le \varepsilon\right) = \exp\left(-\exp(-\varepsilon)\exp(-\alpha)\right) = \exp\left(-\exp\left(-(\varepsilon + \alpha)\right)\right)$$
which is indeed the CDF of an Extreme Value Type I random variable.
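The closure result can be checked numerically: the product of the individual CDFs should coincide with a single Extreme Value Type I CDF whose parameter is $\alpha = -\log \sum_i \exp(-\alpha_i)$. The sketch below is only an illustration; the function names and the parameter values are assumptions made for the example.

```python
import math


def ev1_cdf(c, alpha):
    """F(c) = exp(-exp(-(c + alpha)))."""
    return math.exp(-math.exp(-(c + alpha)))


def max_cdf_by_product(c, alphas):
    """Pr(max_i eps_i <= c) under independence: product of the CDFs."""
    p = 1.0
    for a in alphas:
        p *= ev1_cdf(c, a)
    return p


def combined_alpha(alphas):
    """alpha solving sum_i exp(-alpha_i) = exp(-alpha)."""
    return -math.log(sum(math.exp(-a) for a in alphas))


alphas = [0.2, -1.0, 1.5]  # arbitrary illustrative parameters
alpha = combined_alpha(alphas)
gaps = [
    abs(max_cdf_by_product(c, alphas) - ev1_cdf(c, alpha))
    for c in (-2.0, 0.0, 1.0, 3.0)
]
```

Every entry of `gaps` should be zero up to floating-point rounding.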
Random Utility Model
• An individual with characteristics $s$ faces a feasible choice set $B$, with elements $x \in B$.
• We write $\Pr(x \mid s, B)$ for the probability that a person with characteristics $s$ chooses $x$ from the feasible set $B$.
• We also suppose that:
$$U(s, x) = v(s, x) + \varepsilon(s, x)$$
where $\varepsilon$ is independent Extreme Value Type I. For alternative $i$ we write $U_i$, $v_i$, and $\varepsilon_i$.
• From the properties of Extreme Value Type I random variables discussed above, we know that $\varepsilon_i + v_i$ (and thus $U_i$) has an Extreme Value Type I distribution with parameter $\alpha_i - v_i$, as shown below:
$$F_{U_i}(\varepsilon) = \Pr(\varepsilon_i + v_i < \varepsilon) = \Pr(\varepsilon_i < \varepsilon - v_i) = \exp\left(-\exp\left(-(\varepsilon + \alpha_i - v_i)\right)\right)$$
• Let us now suppose that there are two goods and two corresponding utilities.
• Consumers govern their choices by the obvious decision rule: choose good one if $U_1 > U_2$.
• More generally, if there are $n$ goods, then good $j$ will be selected if $j \in \arg\max_{i \in \{1, \dots, n\}} U_i$.
• Specifically, in our two-good case:
$$\Pr(1 \text{ is chosen}) = \Pr(U_1 > U_2) = \Pr(\varepsilon_1 + v_1 > \varepsilon_2 + v_2)$$
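• Imposing that $\varepsilon$ is independent Extreme Value Type I, we can be much more precise about this probability:
$$\begin{aligned}
\Pr(\varepsilon_1 + v_1 > \varepsilon_2 + v_2) &= \Pr(\varepsilon_1 + v_1 - v_2 > \varepsilon_2) \\
&= \int_{-\infty}^{\infty} \left[\int_{-\infty}^{\varepsilon_1 + v_1 - v_2} f(\varepsilon_1) f(\varepsilon_2)\, d\varepsilon_2\right] d\varepsilon_1 \\
&= \int_{-\infty}^{\infty} f(\varepsilon_1) \exp\left(-\exp\left(-(\varepsilon_1 + v_1 - v_2 + \alpha_2)\right)\right) d\varepsilon_1 \qquad (2)
\end{aligned}$$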
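• Observe that $F(\varepsilon_1) = \exp\left(-\exp\left(-(\varepsilon_1 + \alpha_1)\right)\right)$, which implies:
$$\begin{aligned}
f(\varepsilon_1) = \frac{\partial F(\varepsilon_1)}{\partial \varepsilon_1} &= \exp\left(-\exp\left(-(\varepsilon_1 + \alpha_1)\right)\right) \exp\left(-(\varepsilon_1 + \alpha_1)\right) \\
&= \exp\left(-(\varepsilon_1 + \alpha_1)\right) \exp\left(-\exp\left(-(\varepsilon_1 + \alpha_1)\right)\right)
\end{aligned}$$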
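• Substituting this into equation (2) gives us:
$$\begin{aligned}
\Pr(1 \text{ is chosen}) &= \int_{-\infty}^{\infty} \exp\left(-(\varepsilon_1 + \alpha_1)\right) \exp\left(-\exp\left(-(\varepsilon_1 + \alpha_1)\right)\right) \exp\left(-\exp\left(-(\varepsilon_1 + v_1 - v_2 + \alpha_2)\right)\right) d\varepsilon_1 \\
&= \int_{-\infty}^{\infty} e^{-\alpha_1} e^{-\varepsilon_1} \exp\left\{-e^{-\varepsilon_1}\left[e^{-\alpha_1} + e^{-(v_1 - v_2 + \alpha_2)}\right]\right\} d\varepsilon_1 \\
&= \exp(-\alpha_1)\, \frac{1}{\exp(-\alpha_1) + \exp\left(-(v_1 - v_2 + \alpha_2)\right)} \left[\exp\left\{-e^{-\varepsilon_1}\left[e^{-\alpha_1} + e^{-(v_1 - v_2 + \alpha_2)}\right]\right\}\right]_{-\infty}^{\infty} \\
&= \frac{\exp(-\alpha_1)}{\exp(-\alpha_1) + \exp\left(-(v_1 - v_2 + \alpha_2)\right)} \\
&= \frac{\exp(v_1 - \alpha_1)}{\exp(v_1 - \alpha_1) + \exp(v_2 - \alpha_2)}
\end{aligned}$$
The bracketed term equals 1 at $\varepsilon_1 = \infty$ and 0 at $\varepsilon_1 = -\infty$, and the last line follows from multiplying the numerator and denominator by $\exp(v_1)$.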
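As a sanity check on the closed form, the integral in equation (2) can be evaluated numerically and compared with $\exp(v_1 - \alpha_1) / [\exp(v_1 - \alpha_1) + \exp(v_2 - \alpha_2)]$. The sketch below uses a plain trapezoid rule on a truncated range; the integration limits, step count, function names, and parameter values are all assumptions made for illustration.

```python
import math


def ev1_cdf(c, alpha):
    """F(c) = exp(-exp(-(c + alpha)))."""
    return math.exp(-math.exp(-(c + alpha)))


def ev1_pdf(c, alpha):
    """f(c) = exp(-(c + alpha)) * exp(-exp(-(c + alpha)))."""
    z = math.exp(-(c + alpha))
    return z * math.exp(-z)


def prob_1_numeric(v1, v2, a1, a2, lo=-30.0, hi=30.0, steps=60000):
    """Trapezoid approximation of the integral in equation (2)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        e1 = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * ev1_pdf(e1, a1) * ev1_cdf(e1 + v1 - v2, a2)
    return total * h


def prob_1_closed(v1, v2, a1, a2):
    """Closed-form two-good logit probability derived above."""
    num = math.exp(v1 - a1)
    return num / (num + math.exp(v2 - a2))


numeric = prob_1_numeric(1.0, 0.3, 0.2, -0.4)
closed = prob_1_closed(1.0, 0.3, 0.2, -0.4)
```

For moderate parameter values the two numbers should agree to several decimal places; the truncation to [-30, 30] is only adequate when the $v$ and $\alpha$ values are not extreme.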
• This result generalizes, because the maximum over $(n - 1)$ choices is still Extreme Value Type I, so we can make a two-stage maximization argument, as follows:
$$\begin{aligned}
\Pr(\varepsilon_1 + v_1 > \varepsilon_i + v_i,\ i = 2, \dots, n) &= \Pr\left(\varepsilon_1 + v_1 > \max_{i = 2, \dots, n} (\varepsilon_i + v_i)\right) \\
&= \frac{\exp(v_1 - \alpha_1)}{\exp(v_1 - \alpha_1) + \exp(v_2 - \alpha_2) + \dots + \exp(v_n - \alpha_n)} \\
&= \frac{\exp(\tilde{v}_1)}{\sum_{i=1}^{n} \exp(\tilde{v}_i)}
\end{aligned}$$
where $\tilde{v}_j = v_j - \alpha_j$.
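The $n$-good formula is straightforward to compute. The following sketch subtracts the largest $\tilde{v}_i$ before exponentiating, a standard numerical safeguard that leaves the ratio unchanged; the function name `choice_probs` and the inputs are hypothetical.

```python
import math


def choice_probs(v, alpha):
    """Return exp(v~_j) / sum_i exp(v~_i) with v~_i = v_i - alpha_i."""
    v_tilde = [vi - ai for vi, ai in zip(v, alpha)]
    m = max(v_tilde)  # shifting by a constant does not change the ratios
    weights = [math.exp(x - m) for x in v_tilde]
    total = sum(weights)
    return [w / total for w in weights]


probs = choice_probs([1.0, 0.3, -0.5], [0.2, -0.4, 0.0])
```

With two goods the function reduces to the two-good expression derived earlier, and the returned probabilities always sum to one.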
• This type of model of probabilistic choice is called a conditional or multinomial logit model.
• The difference between “conditional” and “multinomial” is simply that in the “conditional” logit case, the values of the variables (usually choice characteristics) vary across the choices, while the parameters are common across the choices.
• In the “multinomial” logit case, the values of the variables are common across choices for the same person (usually individual characteristics) but the parameters vary across choices.
• For example, when $v_i$ is linear, the probability of individual $j$ making choice $i$ from among $m$ choices is:
Conditional Logit case:
$$P_{ij} = \frac{\exp(\beta' c_{ij})}{\sum_{k=1}^{m} \exp(\beta' c_{kj})},$$
where $c_{ij}$ is the vector of values of characteristics of choice $i$ as perceived by individual $j$.
Multinomial Logit case:
$$P_{ij} = \frac{\exp(\alpha_i' s_j)}{\sum_{k=1}^{m} \exp(\alpha_k' s_j)},$$
where $s_j$ is a vector of individual characteristics for individual $j$.
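The two specifications differ only in what varies across choices: the data $c_{ij}$ in the conditional case, the coefficients $\alpha_i$ in the multinomial case. A minimal sketch for one individual $j$ follows; the function names and the numbers are made up for illustration and do not come from the notes.

```python
import math


def softmax(indices):
    """Map linear indices to choice probabilities."""
    m = max(indices)
    weights = [math.exp(x - m) for x in indices]
    total = sum(weights)
    return [w / total for w in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conditional_logit(beta, c_j):
    """c_j[i] is the characteristics vector c_ij; beta is common to all i."""
    return softmax([dot(beta, c_ij) for c_ij in c_j])


def multinomial_logit(alphas, s_j):
    """alphas[i] is the coefficient vector alpha_i; s_j is common to all i."""
    return softmax([dot(alpha_i, s_j) for alpha_i in alphas])


# Two choices, one characteristic each (illustrative values).
p_cond = conditional_logit([1.0], [[0.0], [math.log(3.0)]])
# Three choices, two individual characteristics (illustrative values).
p_mult = multinomial_logit([[0.0, 0.0], [0.5, -0.2], [1.0, 0.1]], [1.0, 2.0])
```

In the first call the indices are $0$ and $\log 3$, so the probabilities are $1/4$ and $3/4$.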
• Note that we can easily combine the two cases under one model, as described below.
Generalized case: We can combine the conditional and multinomial logit models by generalizing either one of the two types of models. For example, we could permit the coefficients in the multinomial logit case to depend on choice characteristics, i.e., have:
$$\alpha_i = \phi_i + c_{ij}' \theta$$
• Then we get the generalized case, where the probability of choice $i$ by individual $j$ depends on both individual and choice characteristics (as well as interaction terms):
$$P_{ij} = \frac{\exp(\alpha_i' s_j)}{\sum_{k=1}^{m} \exp(\alpha_k' s_j)} = \frac{\exp(\phi_i' s_j + \theta' c_{ij} s_j)}{\sum_{k=1}^{m} \exp(\phi_k' s_j + \theta' c_{kj} s_j)}$$
• We could similarly modify the coefficients in the conditional logit case to obtain the generalized version.
Derivation of Logit from the Luce Axioms
• We will now show how the conditional logit can be derived from the random utility model and the Luce Axioms presented below.
Luce Axioms
Axiom 1: Independence of Irrelevant Alternatives (IIA)
Suppose that $x, y \in B$, $s \in S$. Then,
$$\Pr(x \mid s, \{x, y\}) \Pr(y \mid s, B) = \Pr(y \mid s, \{x, y\}) \Pr(x \mid s, B)$$
or, we have:
$$\frac{\Pr(x \mid s, \{x, y\})}{\Pr(y \mid s, \{x, y\})} = \frac{\Pr(x \mid s, B)}{\Pr(y \mid s, B)}.$$
• The term on the left is the odds: the ratio of the probabilities of choosing $x$ and $y$, given characteristics $s$ and the choice set $\{x, y\}$.
• This axiom has been named “Independence of Irrelevant Alternatives” for an obvious reason: the odds of choosing $x$ over $y$ are not affected by adding additional alternatives.
• Note that this assumes that the additional choices entering $B$ affect the probability of choosing $x$ in the same manner as they affect the probability of choosing $y$; implicitly we are assuming that the additional choices have an equivalent relationship with choice $x$ and choice $y$.
• We will see below how this assumption is a limitation.
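The logit probabilities from the random utility derivation satisfy this axiom, which can be seen numerically: the odds of $x$ against $y$ are the same in $\{x, y\}$ as in a larger set. The sketch below assumes logit-form probabilities with illustrative values of $\tilde{v}$; the names `logit_probs` and `odds` are hypothetical.

```python
import math


def logit_probs(v):
    """Logit choice probabilities for a dict {alternative: v~}."""
    m = max(v.values())
    weights = {k: math.exp(val - m) for k, val in v.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def odds(v, x, y, choice_set):
    """Pr(x | choice_set) / Pr(y | choice_set) under the logit form."""
    p = logit_probs({k: v[k] for k in choice_set})
    return p[x] / p[y]


v = {"x": 1.0, "y": 0.2, "z": -0.5}  # illustrative values only
odds_pair = odds(v, "x", "y", ["x", "y"])
odds_full = odds(v, "x", "y", ["x", "y", "z"])
```

Both `odds_pair` and `odds_full` equal $\exp(\tilde{v}_x - \tilde{v}_y)$, regardless of the value assigned to the added alternative `z`.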
Axiom 2: Positivity
This axiom states that the probability of choosing any one of the choices is strictly greater than zero:
$$\Pr(y \mid s, B) > 0 \quad \forall\, y \in B$$