
Classical Discrete Choice Theory
James J. Heckman, University of Chicago, Econ 312, Spring 2019

Classical regression model:
  y = xβ + ε,  E(ε | x) = 0,  ε ~ N(0, σ²I)


  1. Debreu (1960) Criticism of the Luce Model
• "Red Bus–Blue Bus Problem"
• Suppose the (N+1)-th alternative is identical to the first:
  Pr(choose 1 or N+1 | s, B′) = 2 e^{θ(s)′x_{N+1}} / Σ_{l=1}^{N+1} e^{θ(s)′x_l}
• ⇒ Introducing an identical good changes the probability of riding a bus.
• Not an attractive result.
• It comes from the need to impose the iid assumption on the new alternative.
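The IIA property behind Debreu's criticism is easy to see numerically. A minimal sketch (hypothetical utilities; plain Luce/MNL probabilities):

```python
import numpy as np

def mnl_probs(v):
    """Luce/MNL choice probabilities: P(i) = e^{v_i} / sum_l e^{v_l}."""
    e = np.exp(v - np.max(v))        # subtract max for numerical stability
    return e / e.sum()

# Car and bus equally attractive: each chosen with probability 1/2.
p_two = mnl_probs(np.array([1.0, 1.0]))          # [car, bus]

# Split the bus into two identical alternatives (red bus, blue bus):
# the model now gives the car only 1/3, even though nothing real changed.
p_three = mnl_probs(np.array([1.0, 1.0, 1.0]))   # [car, red bus, blue bus]
```

Duplicating the bus raises the total bus share from 1/2 to 2/3, which is exactly the unattractive substitution pattern the slide describes.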

  2. Debreu (1960) Criticism of the Luce Model: Some Alternative Assumptions
1 Could let v_i = ln(θ(s)′x_i), so that
  Pr(j | s, B) = θ(s)′x_j / Σ_{l=1}^{N} θ(s)′x_l
  If we also imposed Σ_{l=1}^{N} θ(s)′x_l = 1, we would get the linear probability model, but this could violate IIA.
2 Could consider a model of the form
  Pr(j | s, B) = e^{θ_j(s)′x_j} / Σ_{l=1}^{N} e^{θ_l(s)′x_l}
  but here we have lost our forecasting ability (cannot predict demand for a new good).
3 Universal logit model:
  Pr(i | s, x_1, …, x_N) = e^{φ_i(x_1,…,x_N)′β(s)} / Σ_{l=1}^{N} e^{φ_l(x_1,…,x_N)′β(s)}
  Here we lose both IIA and forecasting (Bernstein polynomial).

  3. Criteria for a Good PCS
Goal: We want a probabilistic choice model that
1 has a flexible functional form,
2 is computationally practical,
3 allows for flexibility in representing substitution patterns among choices,
4 is consistent with a random utility model (RUM) ⇒ has a structural interpretation.

  4. How Do You Verify That a Candidate PCS Is Consistent with a RUM?
(a) Either start with a RUM, u_i = v(s, x_i) + ε(s, x_i), and solve the integral
  Pr(u_i > u_l, ∀ l ≠ i) = Pr( i = argmax_l (v_l + ε_l) ),
(b) or start with a candidate PCS and verify that it is consistent with a RUM (easier).
• McFadden provides sufficient conditions.
• See the discussion of the Daly-Zachary-Williams theorem.

  5. Link to AIRUM Models

  6. Daly-Zachary-Williams Theorem
• Daly-Zachary (1976) and Williams (1977) provide a set of conditions that makes it easy to derive a PCS from a RUM for a class of models ("generalized extreme value" (GEV) models).
• Define G: (Y_1, …, Y_J) ↦ G(Y_1, …, Y_J).
• Suppose G satisfies the following:
1 nonnegative: G ≥ 0 for Y_1, …, Y_J ≥ 0;
2 homogeneous of degree one in its arguments;
3 lim_{Y_i→∞} G(Y_1, …, Y_i, …, Y_J) = ∞, ∀ i = 1, …, J;
4 the mixed partial ∂^k G / (∂Y_{i_1} ⋯ ∂Y_{i_k}) (distinct indices) is nonnegative if k is odd and nonpositive if k is even.  (1)

  7. • Then for a RUM with u_i = v_i + ε_i and
  F(ε_1, …, ε_J) = exp{ −G( e^{−ε_1}, …, e^{−ε_J} ) }
• this cdf has Weibull marginals but allows for more dependence among the ε's.
• The PCS is given by
  P_i = ∂ ln G / ∂ v_i = e^{v_i} G_i( e^{v_1}, …, e^{v_J} ) / G( e^{v_1}, …, e^{v_J} )
• Note: McFadden shows that under certain conditions on the form of the indirect utility function (it satisfies the AIRUM form), the DZW result can be seen as a form of Roy's identity.

  8. • Let's apply this result.
• Multinomial logit model (MNL):
  F(ε_1, …, ε_J) = e^{−e^{−ε_1}} ⋯ e^{−e^{−ε_J}} = e^{−Σ_{j=1}^{J} e^{−ε_j}}  ← product of iid Weibull cdfs
• Can verify that G( e^{v_1}, …, e^{v_J} ) = Σ_{j=1}^{J} e^{v_j} satisfies the DZW conditions, so
  P(j) = ∂ ln G / ∂ v_j = e^{v_j} / Σ_{l=1}^{J} e^{v_l} = MNL model

  9. • Another GEV model:
• Nested logit model (addresses, to a limited extent, the IIA criticism).
• Let
  G( e^{v_1}, …, e^{v_J} ) = Σ_{m=1}^{M} a_m ( Σ_{i∈B_m} e^{v_i/(1−σ_m)} )^{1−σ_m}
  (1/(1−σ_m) acts like an elasticity of substitution within branch m)

  10. • Idea: divide goods into branches.
• First choose a branch, then a good within the branch (e.g., car vs. bus; then red bus vs. blue bus).
• This allows for correlation between the errors within a branch (this is the role of σ_m).
• B_m ⊆ {1, …, J}, with ∪_{m=1}^{M} B_m = B; a branch may contain a single good—we need not have all choices on all branches.

  11. • Note: if σ_m = 0 for all m, we get the usual MNL form.
• Calculate:
  P_i = ∂ ln G / ∂ v_i
      = ∂ ln[ Σ_{m=1}^{M} a_m ( Σ_{j∈B_m} e^{v_j/(1−σ_m)} )^{1−σ_m} ] / ∂ v_i
      = Σ_{m: i∈B_m} a_m ( Σ_{j∈B_m} e^{v_j/(1−σ_m)} )^{−σ_m} e^{v_i/(1−σ_m)} / Σ_{m=1}^{M} a_m ( Σ_{j∈B_m} e^{v_j/(1−σ_m)} )^{1−σ_m}
      = Σ_{m=1}^{M} P(i | B_m) P(B_m)

  12. • where
  P(i | B_m) = e^{v_i/(1−σ_m)} / Σ_{j∈B_m} e^{v_j/(1−σ_m)} if i ∈ B_m, 0 otherwise
  P(B_m) = a_m ( Σ_{j∈B_m} e^{v_j/(1−σ_m)} )^{1−σ_m} / Σ_{m′=1}^{M} a_{m′} ( Σ_{j∈B_{m′}} e^{v_j/(1−σ_{m′})} )^{1−σ_{m′}}
• Note: if P(B_m) = 1, we get the logit form.
• Nested logit requires that the analyst make choices about the nesting structure.
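The decomposition P_i = Σ_m P(i | B_m) P(B_m) can be coded directly from the GEV generator. A sketch (illustrative utilities and branch structure; branch weights a_m = 1):

```python
import numpy as np

def nested_logit_probs(v, branches, sigmas):
    """Nested logit P_i = sum_m P(i|B_m) P(B_m), with a_m = 1 for all m.

    v: utilities; branches: list of index arrays B_m; sigmas: per-branch
    similarity parameters sigma_m in [0, 1)."""
    v = np.asarray(v, float)
    probs = np.zeros_like(v)
    # Inclusive values: a_m * (sum_{i in B_m} e^{v_i/(1-sigma_m)})^{1-sigma_m}
    S = [np.exp(v[B] / (1 - s)).sum() for B, s in zip(branches, sigmas)]
    G = sum(Sm ** (1 - s) for Sm, s in zip(S, sigmas))
    for B, s, Sm in zip(branches, sigmas, S):
        P_Bm = Sm ** (1 - s) / G                 # branch probability P(B_m)
        probs[B] += np.exp(v[B] / (1 - s)) / Sm * P_Bm   # P(i|B_m) P(B_m)
    return probs

# Red-bus/blue-bus: car alone on one branch, two identical buses on another.
v = [1.0, 1.0, 1.0]                              # [car, red bus, blue bus]
p = nested_logit_probs(v, [np.array([0]), np.array([1, 2])], [0.0, 0.99])
# With sigma near 1, the car keeps ~1/2 and each bus gets ~1/4.
```

Setting both σ's to 0 recovers the plain MNL shares of 1/3 each, illustrating how σ controls the within-branch substitution pattern.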

  13. • How does nested logit solve the red-bus/blue-bus problem?
• Suppose Y_i = e^{v_i} and
  G = Y_1 + ( Y_2^{1/(1−σ)} + Y_3^{1/(1−σ)} )^{1−σ}

  14.
  P(1 | {1,2,3}) = ∂ ln G / ∂ v_1 = e^{v_1} / [ e^{v_1} + ( e^{v_2/(1−σ)} + e^{v_3/(1−σ)} )^{1−σ} ]
  P(2 | {1,2,3}) = ∂ ln G / ∂ v_2 = e^{v_2/(1−σ)} ( e^{v_2/(1−σ)} + e^{v_3/(1−σ)} )^{−σ} / [ e^{v_1} + ( e^{v_2/(1−σ)} + e^{v_3/(1−σ)} )^{1−σ} ]

  15. • As v_3 → −∞,
  P(1 | {1,2,3}) → e^{v_1} / ( e^{v_1} + e^{v_2} )  (get logistic)
• As v_1 → −∞,
  P(2 | {1,2,3}) → e^{v_2/(1−σ)} / ( e^{v_2/(1−σ)} + e^{v_3/(1−σ)} )

  16. What Role Does σ Play?
• σ is the degree-of-substitutability parameter.
• Recall F(ε_1, ε_2, ε_3) = exp{ −G( e^{−ε_1}, e^{−ε_2}, e^{−ε_3} ) }.
• Here
  σ = cov(ε_2, ε_3) / √( var ε_2 · var ε_3 ) = correlation coefficient
• Thus we require −1 ≤ σ ≤ 1, but it turns out we also need σ > 0 for the DZW conditions to be satisfied. This is unfortunate because it does not allow the ε's to be negatively correlated.
• Can show that
  lim_{σ→1} P(1 | {1,2,3}) = e^{v_1} / ( e^{v_1} + max(e^{v_2}, e^{v_3}) )  (L'Hôpital's rule)

  17. • If v_2 = v_3, then
  P(2 | {1,2,3}) = e^{v_2/(1−σ)} ( 2 e^{v_2/(1−σ)} )^{−σ} / [ e^{v_1} + ( 2 e^{v_2/(1−σ)} )^{1−σ} ]
                = 2^{−σ} e^{v_2} / ( e^{v_1} + 2^{1−σ} e^{v_2} )
  lim_{σ→1} P(2 | {1,2,3}) = (1/2) e^{v_2} / ( e^{v_1} + e^{v_2} )
  ↑ introducing a third identical alternative cuts the probability of choosing 2 in half
• Solves the red-bus/blue-bus problem.
• The probability is cut in half with two identical alternatives.

  18. (Branches: car alone; red bus and blue bus together.)
• σ is a measure of similarity between the red and the blue bus.
• When σ is close to one, the conditional choice probability selects, with high probability, the alternative with the higher utility within the branch.

  19. • Remark: we can expand the nested logit to accommodate multiple levels, e.g., with three levels
  G = Σ_{q=1}^{Q} a_q [ Σ_{m∈Q_q} a_m ( Σ_{i∈B_m} y_i^{1/(1−σ_m)} )^{1−σ_m} ]

  20. • Example: two nested choices
1 Neighborhood (m)
2 Transportation mode (t)
3 P(m): choice of neighborhood
4 P(t | B_m): probability of choosing the t-th mode, given neighborhood m

  21. • Not all modes are available in all neighborhoods (T_m modes in neighborhood m).
  P_{t|m} = e^{v(m,t)/(1−σ_m)} / Σ_{t′=1}^{T_m} e^{v(m,t′)/(1−σ_m)}
  P_m = ( Σ_{t=1}^{T_m} e^{v(m,t)/(1−σ_m)} )^{1−σ_m} / Σ_{j=1}^{M} ( Σ_{t=1}^{T_j} e^{v(j,t)/(1−σ_j)} )^{1−σ_j} = P(B_m)
  P_{m,t} = P_{t|m} P_m = e^{v(m,t)/(1−σ_m)} ( Σ_{t′=1}^{T_m} e^{v(m,t′)/(1−σ_m)} )^{−σ_m} / Σ_{j=1}^{M} ( Σ_{t′=1}^{T_j} e^{v(j,t′)/(1−σ_j)} )^{1−σ_j}

  22. • A standard type of utility function that people might use:
  v(m,t) = z_t′γ + x_{mt}′β + y_m′α

  23. • z_t is transportation-mode characteristics, x_{mt} is interactions, and y_m is neighborhood characteristics.
• Then
  P_{t|m} = e^{(z_t′γ + x_{mt}′β)/(1−σ_m)} / Σ_{t′=1}^{T_m} e^{(z_{t′}′γ + x_{mt′}′β)/(1−σ_m)}
  P_m = e^{y_m′α} ( Σ_{t=1}^{T_m} e^{(z_t′γ + x_{mt}′β)/(1−σ_m)} )^{1−σ_m} / Σ_{j=1}^{M} e^{y_j′α} ( Σ_{t=1}^{T_j} e^{(z_t′γ + x_{jt}′β)/(1−σ_j)} )^{1−σ_j}

  24. • Estimation (in two steps) (see Amemiya, Chapter 9).
• Let
  I_m = Σ_{t=1}^{T_m} e^{(z_t′γ + x_{mt}′β)/(1−σ_m)}

  25.
1 Within each neighborhood, get γ/(1−σ_m) and β/(1−σ_m) by logit.
2 Form Î_m.
3 Then estimate
  P_m = e^{y_m′α + (1−σ_m) ln Î_m} / Σ_{j=1}^{M} e^{y_j′α + (1−σ_j) ln Î_j}
  by MLE to get α̂, σ̂_m.
• Assume σ_m = σ_j ∀ j, m, or at least impose some restrictions across neighborhoods.
• Note: Î_m is an estimated regressor ("Durbin problem").
• Need to correct the standard errors.

  26. Multinomial Probit Models
• Also known as:
1 Thurstone Model (1929, 1930)
2 Thurstone-Quandt Model
3 Developed by Domencich-McFadden (1978) (on reading list)
  u_i = v_i + η_i,  i = 1, …, J
  v_i = Z_iβ  (linear-in-parameters form)
  u_i = Z_iβ + η_i
  MNL: (i) β fixed; (ii) η_i iid
  MNP: (i) β random coefficient, β ~ N(β̄, Σ_β); (ii) β independent of η, η ~ N(0, Σ_η)
• Allows general forms of correlation between the errors.

  27.
  u_i = Z_iβ̄ + Z_i( β − β̄ ) + η_i
• (β − β̄) = ε, and Z_iε + η_i is a composite heteroskedastic error term.
• β random ⇒ taste heterogeneity.
• η_i can be interpreted as unobserved attributes of goods.
• The main advantage of MNP over MNL is that it allows for a general error covariance structure.
• Note: to make computation easier, users sometimes set Σ_β = 0 (fixed-coefficient version).
• Allowing β to be random permits random taste variation—it allows for the possibility that different persons value characteristics differently.

  28. Problem of Identification and Normalization in the MNP Model
• Reference: David Bunch, "Estimability in the Multinomial Probit Model," Transportation Research
• Domencich and McFadden
• Let (J alternatives, K characteristics, β ~ N(β̄, Σ_β))
  Z̃β̄ = ( Z_1β̄, …, Z_Jβ̄ )′,  η = ( η_1, …, η_J )′  (2)

  29. Problem of Identification and Normalization in the MNP Model
• Pr(alternative j selected) = Pr( u_j > u_i, ∀ i ≠ j )
  = ∫_{u_j=−∞}^{∞} ∫_{u_1=−∞}^{u_j} ⋯ ∫_{u_J=−∞}^{u_j} φ( u | V_u, Σ_u ) du_1 ⋯ du_J
  where φ( u | V_u, Σ_u ) is the J-dimensional multivariate normal density with mean V_u and covariance Σ_u.
• Note: unlike the MNL, there is no closed-form expression for the integral.
• The integrals are often evaluated using simulation methods (we will work an example).
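Since the choice probability has no closed form, a crude frequency simulator illustrates how such integrals are evaluated in practice: draw utilities from the multivariate normal and count how often each alternative wins. A sketch with hypothetical means and covariance:

```python
import numpy as np

def mnp_choice_probs(V, Sigma, n_sims=200_000, seed=0):
    """Frequency simulator for MNP choice probabilities:
    draw u ~ N(V, Sigma) and count how often each alternative is the max."""
    rng = np.random.default_rng(seed)
    u = rng.multivariate_normal(V, Sigma, size=n_sims)
    winners = u.argmax(axis=1)
    return np.bincount(winners, minlength=len(V)) / n_sims

# Hypothetical 3-alternative example with correlated errors between 2 and 3.
V = np.array([0.5, 0.0, 0.0])
Sigma = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.8],
                  [0.0, 0.8, 1.0]])
p = mnp_choice_probs(V, Sigma)
```

Smoother simulators (e.g., GHK) are used in serious applications; the frequency simulator is shown only because it maps one-to-one onto the integral above.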

  30. How many parameters are there?
• β̄: K parameters
• Σ_β: K × K symmetric matrix, (K² − K)/2 + K = K(K+1)/2 parameters
• Σ_η: J(J+1)/2 parameters
• Note: when a person chooses j, all we know is relative utility, not absolute utility.
• This suggests that not all parameters in the model will be identified.
• Requires normalizations.

  31. Digression on Identification
• What does it mean to say a parameter is not identified in a model?
• A model with one parameterization is observationally equivalent to the same model with a different parameterization.

  32. Digression on Identification
• Example: binary probit model (fixed β)
  Pr( D = 1 | Z ) = Pr( v_1 + ε_1 > v_2 + ε_2 )
                 = Pr( x_1β + ε_1 > x_2β + ε_2 )
                 = Pr( (x_1 − x_2)β > ε_2 − ε_1 )
                 = Pr( (x̃β)/σ > (ε_2 − ε_1)/σ ),  x̃ = x_1 − x_2
                 = Φ( x̃β/σ )
• Φ( x̃β/σ ) is observationally equivalent to Φ( x̃β*/σ* ) for β/σ = β*/σ*.

  33. • β is not separately identified relative to σ, but the ratio is identified:
  Φ( x̃β/σ ) = Φ( x̃β*/σ* )
  Φ^{−1} Φ( x̃β/σ ) = Φ^{−1} Φ( x̃β*/σ* )
  ⇒ β/σ = β*/σ*
• The set { b : b = β·δ, δ any positive scalar } is identified (say "β is identified up to scale, and the sign is identified").
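The scale normalization can be checked numerically: parameter pairs with the same ratio β/σ imply identical probit probabilities at every x̃. A small sketch (all numbers hypothetical):

```python
import numpy as np
from scipy.stats import norm

# Two parameterizations with the same ratio beta/sigma are observationally
# equivalent: they imply identical choice probabilities at every x.
x_tilde = np.linspace(-3.0, 3.0, 7)   # x1 - x2 differences (hypothetical)
beta, sigma = 1.5, 2.0
beta_star, sigma_star = 3.0, 4.0      # both scaled by delta = 2, same ratio

p1 = norm.cdf(x_tilde * beta / sigma)
p2 = norm.cdf(x_tilde * beta_star / sigma_star)
# p1 == p2 everywhere: only beta/sigma is identified from choice data.
```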

  34. Identification in the MNP Model
  Pr( j selected | V_u, Σ_u ) = Pr( u_i − u_j < 0, ∀ i ≠ j )
• Define the (J−1) × J contrast matrix (each row has a 1 in one column i ≠ j and a −1 in column j):
  Δ_j = [ 1 0 ⋯ −1 ⋯ 0
          0 1 ⋯ −1 ⋯ 0
          ⋮       ⋮
          0 0 ⋯ −1 ⋯ 1 ]
  Δ_j ũ = ( u_1 − u_j, …, u_J − u_j )′  (excluding the j-th contrast)

  35. Identification in the MNP Model
  Pr( j selected | V_u, Σ_u ) = Pr( Δ_j ũ < 0 | V_u, Σ_u ) = Φ( 0 | V_Z, Σ_Z )
• where
1 V_Z = Δ_j Z̃β̄ is the mean of Δ_j ũ
2 Σ_Z = Δ_j Z̃ Σ_β Z̃′Δ_j′ + Δ_j Σ_η Δ_j′ is the variance of Δ_j ũ
3 V_Z is (J−1) × 1
4 Σ_Z is (J−1) × (J−1)
• We reduce the dimension of the integral by one.

  36. • This says that all of the information exists in the contrasts.
• We can't identify all the components because we only observe the contrasts.
• Now choose J as the reference alternative with contrast matrix Δ_J, and define Δ̃_j as Δ_j with the J-th column removed.
• Then one can verify that Δ_j = Δ̃_j · Δ_J.

  37. • For example, with three goods (j = 2, reference alternative J = 3):
  Δ̃_2 Δ_3 = [ 1 −1 ; 0 −1 ] × [ 1 0 −1 ; 0 1 −1 ] = [ 1 −1 0 ; 0 −1 1 ] = Δ_2
  (Δ̃_2: 3rd column removed; Δ_3: J = 3 is the reference alternative; Δ_2: 3rd column included)

  38. • Therefore, we can write
  V_Z = Δ_j Z̃β̄
  Σ_Z = Δ_j Z̃ Σ_β Z̃′Δ_j′ + Δ̃_j Δ_J Σ_η Δ_J′ Δ̃_j′ = Δ_j Z̃ Σ_β Z̃′Δ_j′ + Δ̃_j C_J Δ̃_j′
• where C_J = Δ_J Σ_η Δ_J′ is (J−1) × (J−1) symmetric, with ((J−1)² − (J−1))/2 + (J−1) = J(J−1)/2 parameters in total.
• Since the original model can always be expressed in terms of a model with ( β̄, Σ_β, C_J ), it follows that some of the parameters in the original model are not identified.

  39. How many parameters are not identified?
• Original model: K + K(K+1)/2 + J(J+1)/2
• Now: K + K(K+1)/2 + J(J−1)/2
  difference: J(J+1)/2 − J(J−1)/2 = J parameters not identified
• It turns out that one additional parameter is not identified.
• Total: J + 1
• Note: evaluation of Φ( 0 | kV_Z, k²Σ_Z ), k > 0, gives the same result as evaluating Φ( 0 | V_Z, Σ_Z ), so we can eliminate one more parameter by a suitable choice of k.

  40. Illustration
  J = 3,  Σ_η = [ σ_11 σ_12 σ_13 ; σ_21 σ_22 σ_23 ; σ_31 σ_32 σ_33 ]
  C_2 = Δ_2 Σ_η Δ_2′ = [ 1 −1 0 ; 0 −1 1 ] · Σ_η · [ 1 −1 0 ; 0 −1 1 ]′
      = [ σ_11 − 2σ_21 + σ_22 ,  σ_31 − σ_21 − σ_32 + σ_22 ;
          σ_31 − σ_21 − σ_32 + σ_22 ,  σ_33 − 2σ_32 + σ_22 ]

  41. Illustration
  C_2 = Δ̃_2 Δ_3 Σ_η Δ_3′ Δ̃_2′ = [ 1 −1 ; 0 −1 ] · C_3 · [ 1 −1 ; 0 −1 ]′
  with C_3 = Δ_3 Σ_η Δ_3′ = [ σ_11 − 2σ_31 + σ_33 ,  σ_21 − σ_31 − σ_32 + σ_33 ;
                              σ_21 − σ_31 − σ_32 + σ_33 ,  σ_22 − 2σ_32 + σ_33 ]
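The identities Δ_2 = Δ̃_2 Δ_3 and C_2 = Δ̃_2 C_3 Δ̃_2′ can be verified mechanically for any Σ_η. A quick sketch with an arbitrary symmetric positive-definite Σ_η:

```python
import numpy as np

# Contrast matrices for J = 3 from the slides above.
D2 = np.array([[1, -1, 0],
               [0, -1, 1]])         # contrasts u1 - u2, u3 - u2
D3 = np.array([[1, 0, -1],
               [0, 1, -1]])         # contrasts u1 - u3, u2 - u3
D2_tilde = np.array([[1, -1],
                     [0, -1]])      # D2 with the 3rd (reference) column removed

# Any symmetric positive-definite Sigma_eta will do for the check.
A = np.array([[1.0, 0.2, 0.1], [0.3, 1.0, 0.4], [0.2, 0.1, 1.0]])
Sigma_eta = A @ A.T

C2_direct = D2 @ Sigma_eta @ D2.T           # C_2 computed directly
C3 = D3 @ Sigma_eta @ D3.T                  # C_3 for the reference alternative
C2_via_C3 = D2_tilde @ C3 @ D2_tilde.T      # C_2 recovered from C_3
```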

  42. Normalization Approach of Albright, Lerman, and Manski (1978)
• Note: we need J + 1 restrictions on the VCV matrix.
• Fix J parameters by setting the last row and last column of Σ_η to 0.
• Fix the scale by constraining the diagonal elements of Σ_η so that trace(Σ_η)/J equals the variance of a standard Weibull (to compare estimates with MNL and independent probit).

  43. How do we solve the forecasting problem?
• Suppose that we have 2 goods and add a 3rd.
  Pr(1 chosen) = Pr( u_1 − u_2 ≥ 0 ) = Pr( (Z_1 − Z_2)β̄ ≥ ω_2 − ω_1 )
• where ω_1 = Z_1( β − β̄ ) + η_1 and ω_2 = Z_2( β − β̄ ) + η_2, so
  Pr(1 chosen) = ∫_{−∞}^{(Z_1−Z_2)β̄ / [ σ_11 + σ_22 − 2σ_12 + (Z_2−Z_1)Σ_β(Z_2−Z_1)′ ]^{1/2}} (1/√(2π)) e^{−t²/2} dt
• Now add a 3rd good:
  u_3 = Z_3β̄ + Z_3( β − β̄ ) + η_3

  44. • Problem: we don't know the correlation of η_3 with the other errors.
• Suppose that η_3 = 0 (i.e., only preference heterogeneity). Then
  Pr(1 chosen) = ∫_{−∞}^{a} ∫_{−∞}^{b} B.V.N. dt_1 dt_2
  where a = (Z_1 − Z_2)β̄ / [ σ_11 + σ_22 − 2σ_12 + (Z_2 − Z_1)Σ_β(Z_2 − Z_1)′ ]^{1/2}
  and b = (Z_1 − Z_3)β̄ / [ σ_11 + (Z_3 − Z_1)Σ_β(Z_3 − Z_1)′ ]^{1/2}
• We could also solve the forecasting problem if we make an assumption like η_2 = η_3.
• We solve the red-bus/blue-bus problem if η_2 = η_3 = 0 and Z_3 = Z_2.

  45.
  Pr(1 chosen) = Pr( u_1 − u_2 ≥ 0, u_1 − u_3 ≥ 0 )
• but { u_1 − u_2 ≥ 0 } and { u_1 − u_3 ≥ 0 } are then the same event.
• ∴ adding the third choice does not change the probability of choosing 1.

  46. Estimation Methods for MNP Models
• The models tend to be difficult to estimate because of high-dimensional integrals.
• The integrals need to be evaluated at each stage of estimating the likelihood.
• Simulation provides a means of estimating P_ij = Pr( i chooses j ).

  47. Computation and Estimation — Link to Appendix

  48. Classical Models for Estimating Models with Limited Dependent Variables
References:
• Amemiya, Ch. 10
• Different types of sampling (previously discussed):
(a) random sampling
(b) censored sampling
(c) truncated sampling
(d) other non-random (exogenous stratified, choice-based)

  49. Standard Tobit Model (Tobin, 1958): "Type I Tobit"
  y*_i = x_iβ + u_i
• Observe
  y_i = y*_i if y*_i ≥ y_0
  y_i = 0   if y*_i < y_0
  (equivalently, the censoring indicator is 1( y*_i ≥ y_0 ))
• Tobin's example: expenditure on a durable good, observed only if the good is purchased.

  50. Figure 1: scatter of expenditure against individuals, censored at y_0.
Note: censored observations might have bought the good if the price had been lower.
• Estimator. Assume
  u_i | x_i ~ N( 0, σ_u² ),  so  y*_i | x_i ~ N( x_iβ, σ_u² )

  51. Density of the Latent Variable
  g(y*) = π_0 Pr( y*_i < y_0 ) + π_1 f( y*_i | y*_i ≥ y_0 ) Pr( y*_i ≥ y_0 )
  Pr( y*_i < y_0 ) = Pr( x_iβ + u_i < y_0 ) = Pr( u_i/σ_u < (y_0 − x_iβ)/σ_u ) = Φ( (y_0 − x_iβ)/σ_u )
  f( y*_i | y*_i ≥ y_0 ) = (1/σ_u) φ( (y*_i − x_iβ)/σ_u ) / [ 1 − Φ( (y_0 − x_iβ)/σ_u ) ]   (why?)
  because Pr( y* = y*_i | y_0 ≤ y* ) = Pr( xβ + u = y*_i | y_0 ≤ xβ + u ) = Pr( u/σ_u = (y*_i − xβ)/σ_u | u/σ_u ≥ (y_0 − xβ)/σ_u )

  52. • Note that the likelihood can be written as:
  L = Π_0 Φ( (y_0 − x_iβ)/σ_u ) × Π_1 [ 1 − Φ( (y_0 − x_iβ)/σ_u ) ] · (1/σ_u) φ( (y_i − x_iβ)/σ_u ) / [ 1 − Φ( (y_0 − x_iβ)/σ_u ) ]
    = Π_0 Φ( (y_0 − x_iβ)/σ_u ) × Π_1 (1/σ_u) φ( (y_i − x_iβ)/σ_u )
  (the first factor is what you would get with just a simple probit; the second factor carries the additional information)
• You could estimate β up to scale using only the information on whether y_i exceeds y_0, but you will get a more efficient estimate using the additional information.
• If you know y_0, you can estimate σ_u.
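The two-part likelihood translates directly into code. A sketch (simulated data, censoring point y_0 = 0; the design and parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

def tobit_loglik(params, y, X, y0=0.0):
    """Type I Tobit log-likelihood with censoring point y0:
    censored obs contribute log Phi((y0 - x b)/s);
    uncensored obs contribute log (1/s) phi((y - x b)/s)."""
    b, s = params[:-1], params[-1]
    xb = X @ b
    censored = y <= y0                 # observations recorded at the floor y0
    ll = norm.logcdf((y0 - xb[censored]) / s).sum()
    ll += (norm.logpdf((y[~censored] - xb[~censored]) / s) - np.log(s)).sum()
    return ll

# Simulated check on a hypothetical design.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
y_star = X @ np.array([1.0, 2.0]) + rng.normal(scale=1.5, size=500)
y = np.maximum(y_star, 0.0)            # censor at y0 = 0
```

A sanity check on simulated data: the log-likelihood evaluated at the true parameters should exceed its value at a badly perturbed parameter vector.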

  53. Truncated Version of the Type I Tobit
• Observe y_i = y*_i if y*_i > 0.
• Observe nothing for censored observations.
• Example: only observe wages for workers.
  L = Π_1 (1/σ_u) φ( (y_i − x_iβ)/σ_u ) / Φ( x_iβ/σ_u )
  since Pr( y*_i > 0 ) = Pr( xβ + u > 0 ) = Pr( u/σ_u > −xβ/σ_u ) = Pr( u/σ_u < xβ/σ_u ) = Φ( xβ/σ_u )

  54. Different Ways of Estimating the Tobit Model
(a) If censored, can obtain estimates of β/σ_u by simple probit.
(b) Run OLS on the observations for which y*_i is observed (take y_0 = 0):
  E( y_i | x_iβ + u_i ≥ 0 ) = x_iβ + σ_u E( u_i/σ_u | u_i/σ_u > −x_iβ/σ_u )
• This is the conditional mean of a truncated normal r.v., and
  E( u_i/σ_u | u_i/σ_u > −x_iβ/σ_u ) = φ( x_iβ/σ_u ) / Φ( x_iβ/σ_u ) = λ( x_iβ/σ_u )
• λ is known as the (inverse) Mills ratio; the bias due to censoring can be viewed as an omitted-variables problem.

  55. Heckman Two-Step Procedure
• Step 1: estimate β/σ_u by probit.
• Step 2: form λ̂( x_iβ̂/σ̂ ) and regress
  y_i = x_iβ + σ λ̂( x_iβ/σ ) + v_i + ε_i
  v_i = σ [ λ( x_iβ/σ ) − λ̂( x_iβ/σ ) ]
  ε_i = u_i − E( u_i | u_i > −x_iβ )

  56. • Note: the errors (v + ε) will be heteroskedastic.
• Need to account for the fact that λ is estimated (Durbin problem).
• Ways of doing this:
(a) Delta method
(b) GMM (Newey, Economics Letters, 1984)
(c) Suppose you run OLS using all the data:
  E( y_i ) = Pr( y*_i ≤ 0 ) · 0 + Pr( y*_i > 0 ) [ x_iβ + σ_u E( u_i/σ_u | u_i/σ_u > −x_iβ/σ_u ) ]
           = Φ( x_iβ/σ ) [ x_iβ + σ_u λ( x_iβ/σ ) ]
  Could estimate the model by replacing Φ with Φ̂ and λ with λ̂.
  For both (b) and (c), the errors are heteroskedastic, meaning that you could use weights to improve efficiency.
  Also need to adjust for the estimated regressor.
(d) Estimate the model by Tobit maximum likelihood directly.

  57. Variations on the Standard Tobit Model (Type II Tobit)
  y*_1i = x_1iβ_1 + u_1i
  y*_2i = x_2iβ_2 + u_2i
  y_2i = y*_2i if y*_1i ≥ 0,  y_2i = 0 otherwise
• Example:
• y_2i: student test scores
• y*_1i: index representing parents' propensity to enroll students in school
• Test scores are only observed for the proportion enrolled.

  58.
  L = Π_1 [ Pr( y*_1i > 0 ) f( y_2i | y*_1i > 0 ) ] Π_0 [ Pr( y*_1i ≤ 0 ) ]
  f( y*_2i | y*_1i ≥ 0 ) = ∫_0^∞ f( y*_1i, y*_2i ) dy*_1i / ∫_0^∞ f( y*_1i ) dy*_1i
                        = f( y_2i ) ∫_0^∞ f( y*_1i | y*_2i ) dy*_1i / ∫_0^∞ f( y*_1i ) dy*_1i
                        = (1/σ_2) φ( (y*_2i − x_2iβ_2)/σ_2 ) · ∫_0^∞ f( y*_1i | y*_2i ) dy*_1i / Pr( y*_1i > 0 )
  with y_1i ~ N( x_1iβ_1, σ_1² ), y_2i ~ N( x_2iβ_2, σ_2² )

  59.
  y*_1i | y*_2i ~ N( x_1iβ_1 + (σ_12/σ_2²)( y_2i − x_2iβ_2 ),  σ_1² − σ_12²/σ_2² )
  E( y*_1i | u_2i = y*_2i − x_2iβ_2 ) = x_1iβ_1 + E( u_1i | u_2i = y*_2i − x_2iβ_2 )

  60. Estimation by MLE
  L = Π_0 [ 1 − Φ( x_1iβ_1/σ_1 ) ] × Π_1 (1/σ_2) φ( (y_2i − x_2iβ_2)/σ_2 ) · Φ( [ x_1iβ_1 + (σ_12/σ_2²)( y_2i − x_2iβ_2 ) ] / [ σ_1² − σ_12²/σ_2² ]^{1/2} )

  61. Estimation by a Two-Step Approach
• Using the data on y_2i for which y_1i > 0:
  E( y_2i | y_1i > 0 ) = x_2iβ_2 + E( u_2i | x_1iβ_1 + u_1i > 0 )
                      = x_2iβ_2 + σ_2 E( u_2i/σ_2 | u_1i/σ_1 > −x_1iβ_1/σ_1 )
                      = x_2iβ_2 + (σ_12/σ_1) E( u_1i/σ_1 | u_1i/σ_1 > −x_1iβ_1/σ_1 )
                      = x_2iβ_2 + (σ_12/σ_1) λ( x_1iβ_1/σ_1 )

  62. Example: Female Labor Supply Model
  max u( L, x )  s.t.  x = wH + v,  H = 1 − L
  where H: hours worked; v: asset income; w given; P_x = 1; L: time spent at home (e.g., for child care)
  (∂u/∂L) / (∂u/∂x) = w when L < 1
  reservation wage: w^R = MRS |_{H=0}

  63. Example: Female Labor Supply Model
• We don't observe w^R directly.
• Model:
  w^0 = xβ + u  (wage the person would earn if she worked)
  w^R = zγ + v
  w_i = w^0_i if w^R_i < w^0_i,  w_i = 0 otherwise
• This fits within the previous Tobit framework if we set
  y*_1i = w^0 − w^R = xβ − zγ + u − v,  y_2i = w_i
• Note: Gronau does not develop a model to explain hours of work.

  64. Incorporate the choice of H:
  w^0 = x_2iβ_2 + u_2i  (given)
  MRS = (∂u/∂L) / (∂u/∂x) = γ H_i + z_i′α + v_i
  (assume a functional form for the utility function that yields this)

  65.
  w^r( H_i = 0 ) = z_i′α + v_i
  work if w^0 = x_2iβ_2 + u_2i > z_i′α + v_i
  if work, then w^0 = MRS
  ⇒ x_2iβ_2 + u_2i = γ H_i + z_i′α + v_i
  ⇒ H_i = ( x_2iβ_2 − z_i′α + u_2i − v_i ) / γ = x_1iβ_1 + u_1i
  where x_1iβ_1 = ( x_2iβ_2 − z_i′α ) γ^{−1} and u_1i = ( u_2i − v_i ) γ^{−1}

  66. Type 3 Tobit Model
  y*_1i = x_1iβ_1 + u_1i  ← hours
  y*_2i = x_2iβ_2 + u_2i  ← wage
  y_1i = y*_1i if y*_1i > 0,  y_1i = 0 if y*_1i ≤ 0
  y_2i = y*_2i if y*_1i > 0,  y_2i = 0 if y*_1i ≤ 0

  67. • Here
  H_i = H*_i if H*_i > 0,  H_i = 0 if H*_i ≤ 0
  w_i = w^0_i if H*_i > 0,  w_i = 0 if H*_i ≤ 0
• Note: the Type IV Tobit simply adds
  y_3i = y*_3i if y*_1i > 0,  y_3i = 0 if y*_1i ≤ 0

  68. • Can estimate by:
(1) maximum likelihood
(2) the two-step method:
  E( w^0_i | H_i > 0 ) = γ H_i + z_iα + E( v_i | H_i > 0 )

  69. Type V Tobit Model of Heckman (1978)
  y*_1i = γ_1 y_2i + x_1iβ_1 + δ_1 w_i + u_1i
  y_2i = γ_2 y*_1i + x_2iβ_2 + δ_2 w_i + u_2i
• Analysis of an antidiscrimination law on the average income of African Americans in the i-th state.
• Observe x_1i, x_2i, y_2i, and
  w_i = 1 if y*_1i > 0,  w_i = 0 if y*_1i ≤ 0
• y_2i = average income of African Americans in the state
• y*_1i = unobservable sentiment toward African Americans
• w_i = 1 if the law is in effect

  70. • Adoption of the law is endogenous.
• Require the restriction γ_1δ_2 + δ_1 = 0 so that we can solve for y*_1i as a function that does not depend on w_i.
• This class of models is known as "dummy endogenous variable" models.
• Coherency problem (suppose not restricted?).

  71. Relaxing Parametric Assumptions in the Selection Model
References:
• Heckman (AER, 1990), "Varieties of Selection Bias"
• Heckman (1980), "Addendum to Sample Selection Bias as a Specification Error"
• Heckman and Robb (1985, 1986)
  y*_1 = xβ + u
  y*_2 = zγ + v
  y_1 = y*_1 if y*_2 > 0

  72. Relaxing Parametric Assumptions in the Selection Model
  E( y*_1 | observed ) = xβ + E( u | x, zγ + v > 0 ) + [ u − E( u | x, zγ + v > 0 ) ]
  E( u | x, zγ + v > 0 ) = ∫_{−∞}^{∞} ∫_{−zγ}^{∞} u f( u, v | x, z ) dv du / ∫_{−∞}^{∞} ∫_{−zγ}^{∞} f( u, v | x, z ) dv du
• Note: Pr( y*_2 > 0 | z ) = Pr( zγ + v > 0 | z ) = P(z) = 1 − F_v( −zγ )

  73.
  ⇒ F_v( −zγ ) = 1 − P(z)
  ⇒ −zγ = F_v^{−1}( 1 − P(z) )
• Can replace −zγ in the integrals by F_v^{−1}( 1 − P(z) ) if, in addition, f( u, v | x, z ) = f( u, v | zγ ) (index sufficiency).
• Then
  E( y*_1 | y*_2 > 0 ) = xβ + g( P(z) ) + ε
  where g( P(z) ) is the bias or "control function."
• Semiparametric selection model: approximate the bias function by a Taylor series in P(z)—a truncated power series.
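The truncated-power-series idea can be sketched as follows. The data-generating process, the known first-stage index, and the cubic order are all illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

# Control-function sketch: approximate the selection bias g(P) by a truncated
# power series in the selection probability P (here taken as known).
rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(size=n)
z = rng.normal(size=n)
eps = rng.normal(size=n)
u = 0.7 * eps + 0.5 * rng.normal(size=n)      # outcome error, correlated with v
v = eps                                        # selection error
y = 1.0 + 2.0 * x + u                          # outcome equation, true slope = 2
sel = (x + z + v > 0)                          # selection depends on x too

P = norm.cdf(x + z)                            # selection probability P(w)

# Naive OLS on the selected sample: the slope on x is biased by E(u | sel, x).
Xn = np.column_stack([np.ones(n), x])[sel]
b_naive, *_ = np.linalg.lstsq(Xn, y[sel], rcond=None)

# Control-function OLS: add a cubic in P to absorb g(P).
Xc = np.column_stack([np.ones(n), x, P, P**2, P**3])[sel]
b_cf, *_ = np.linalg.lstsq(Xc, y[sel], rcond=None)
# b_cf[1] should be much closer to the true slope 2.0 than b_naive[1].
```

In practice P(z) itself must be estimated in a first stage (probit or nonparametrically), and the order of the power series is a tuning choice; the point of the sketch is only that the polynomial in P absorbs the selection bias term g(P(z)).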
