

  1. MECT Microeconometrics, Blundell Lecture 3: Selection Models. Richard Blundell, http://www.ucl.ac.uk/~uctp39a/, University College London, February-March 2015.

  2. The Selectivity Model. Generalises the censored regression model by specifying a mixture of discrete and continuous processes.
     - Extends the 'corner solution' model to cover models with fixed costs.
     - Extends to cover the case of heterogeneous treatment effect models.
     Write the latent process for the variable of interest as
     $y^*_{1i} = x'_{1i}\beta_1 + u_{1i}$ with $E(u_1 \mid x_1) = 0$.
     The observation rule for $y_1$ is given by
     $y_{1i} = y^*_{1i}$ if $y^*_{2i} > 0$, and $y_{1i} = 0$ otherwise,
     where $y^*_{2i} = x'_{2i}\beta_2 + u_{2i}$ and $y_{2i} = 1$ if $y^*_{2i} > 0$, $y_{2i} = 0$ otherwise, as in the Probit model.

  3. Consider the selected sample with $y^*_{2i} > 0$. OLS is biased since, as we know,
     $E(u_{1i} \mid y^*_{2i} > 0) = E(u_{1i} \mid x'_{2i}\beta_2 + u_{2i} > 0) = E(u_{1i} \mid u_{2i} > -x'_{2i}\beta_2) \neq 0$
     if $u_1$ and $u_2$ are correlated.
     - Suppose to begin with we assume $(u_1, u_2)$ are jointly normally distributed with mean zero and constant covariance matrix,
       $\binom{u_1}{u_2} \sim N\left( \binom{0}{0}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & 1 \end{pmatrix} \right).$
     - We can write the orthogonal decomposition of $u_1$ given $u_2$ as $u_{1i} = \sigma_{12} u_{2i} + \varepsilon_{1i}$, where $\varepsilon_1$ is distributed independently of $u_2$ and has a marginal normal distribution.
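As a quick check of this bias, a minimal simulation sketch (my illustration, not from the slides): with the same regressor $x$ in both equations the selection term $\sigma_{12}\lambda(\cdot)$ is correlated with $x$, so the OLS slope on the selected sample is pulled away from $\beta_1$. All numerical values are assumptions chosen for the example.

```python
# Minimal simulation of the selectivity model above (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta1, beta2, sigma12 = 1.0, 1.0, 0.6        # assumed true parameters
cov = [[1.0, sigma12], [sigma12, 1.0]]       # Var(u2) normalised to 1
u1, u2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

x = rng.normal(size=n)
y1_star = beta1 * x + u1                     # latent outcome
sel = beta2 * x + u2 > 0                     # observation rule: y2* > 0

# OLS of y1 on x using only the selected sample
X = np.column_stack([np.ones(sel.sum()), x[sel]])
b_ols, *_ = np.linalg.lstsq(X, y1_star[sel], rcond=None)
print(b_ols[1])  # noticeably below the true beta1 = 1.0
```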

  4. Substituting, we have
     $E(u_{1i} \mid y^*_{2i} > 0) = E(\sigma_{12} u_{2i} + \varepsilon_{1i} \mid u_{2i} > -x'_{2i}\beta_2) = \sigma_{12} E(u_{2i} \mid u_{2i} > -x'_{2i}\beta_2) + E(\varepsilon_{1i} \mid u_{2i} > -x'_{2i}\beta_2) = \sigma_{12} E(u_{2i} \mid u_{2i} > -x'_{2i}\beta_2).$
     - From the last lecture we have the conditional mean for the truncated normal:
       $E(w \mid w > c) = \int_c^\infty w f(w \mid w > c)\, dw = \frac{\left[ -\sigma\, \phi\!\left(\frac{w}{\sigma}\right) \right]_c^\infty}{1 - \Phi\!\left(\frac{c}{\sigma}\right)} = \sigma\, \frac{\phi\!\left(\frac{c}{\sigma}\right)}{1 - \Phi\!\left(\frac{c}{\sigma}\right)}.$
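A quick numerical check of the truncated-mean formula (my illustration; the values of $\sigma$ and $c$ are assumed for the check):

```python
# Monte Carlo check of E(w | w > c) = sigma * phi(c/sigma) / (1 - Phi(c/sigma))
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
sigma, c = 2.0, 0.5                      # assumed values for the check
w = rng.normal(0.0, sigma, size=2_000_000)

mc = w[w > c].mean()                     # simulated truncated mean
formula = sigma * norm.pdf(c / sigma) / (1 - norm.cdf(c / sigma))
print(f"simulated: {mc:.4f}   formula: {formula:.4f}")  # agree closely
```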

  5. Noting that $\sigma_{22} \equiv 1$, we have
     $E(u_{1i} \mid y^*_{2i} > 0) = \sigma_{12} E(u_{2i} \mid u_{2i} > -x'_{2i}\beta_2) = \sigma_{12}\, \frac{\phi(-x'_{2i}\beta_2)}{1 - \Phi(-x'_{2i}\beta_2)} = \sigma_{12}\, \frac{\phi(x'_{2i}\beta_2)}{\Phi(x'_{2i}\beta_2)} = \sigma_{12}\, \lambda(x'_{2i}\beta_2).$
     - In general, provided we have this linear index specification, $E(u_{1i} \mid y^*_{2i} > 0) = g(x'_{2i}\beta_2)$.
     - This implies that selection is simply a function of the single index $x'_{2i}\beta_2$ in the selection equation, even when joint normality cannot be assumed. However, note the restrictiveness of the single linear index specification.

  6. Given this result for the joint normal linear index selection model we can easily derive the familiar Heckman and Maximum Likelihood estimators. The selection model can now be rewritten:
     $y^*_{1i} = x'_{1i}\beta_1 + \sigma_{12}\, \lambda(x'_{2i}\beta_2) + \varepsilon_{1i}$
     with $E(\varepsilon_1 \mid x_1, x_2) = 0$ and $E(\varepsilon_1^2 \mid x_1, x_2) = \omega_{11}$. The observation rule for $y_1$ is given by
     $y_{1i} = y^*_{1i}$ if $y^*_{2i} > 0$, and $y_{1i} = 0$ otherwise,
     where $y^*_{2i} = x'_{2i}\beta_2 + u_{2i}$ as before.

  7. We can write the log-likelihood to mirror this conditional specification. The log-likelihood contribution for observation $i$ is
     $l_i(\beta_1, \beta_2, \omega_{11}, \sigma_{12}) = D_i \ln\left\{ \frac{1}{\sqrt{2\pi\omega_{11}}} \exp\left[ -\frac{\left( y_{1i} - x'_{1i}\beta_1 - \sigma_{12}\lambda(x'_{2i}\beta_2) \right)^2}{2\omega_{11}} \right] \right\} + D_i \ln \Phi(x'_{2i}\beta_2) + (1 - D_i) \ln\left[ 1 - \Phi(x'_{2i}\beta_2) \right]$
     and, summing over the sample,
     $\ln L_N(\beta_1, \beta_2, \omega_{11}, \sigma_{12}) = \sum_{i=1}^N \left\{ D_i \ln\left[ \frac{1}{\sqrt{2\pi\omega_{11}}} \exp\left( -\frac{\left( y_{1i} - x'_{1i}\beta_1 - \sigma_{12}\lambda(x'_{2i}\beta_2) \right)^2}{2\omega_{11}} \right) \right] + D_i \ln \Phi(x'_{2i}\beta_2) + (1 - D_i) \ln\left[ 1 - \Phi(x'_{2i}\beta_2) \right] \right\}.$
     Notice that $\beta_1, \omega_{11}, \sigma_{12}$ do not occur in the second part of this expression, so there is a natural partition of the log-likelihood into the binary model for selection, which estimates $\beta_2$, and the conditional model on the selected sample. Thus we have the Heckman selectivity estimator, or 'Heckit'.

  8. The Heckit estimator is the first round of a full MLE estimation and produces consistent but not fully efficient estimators. First estimate $\beta_2$ by Probit. Then, conditioning on $\hat\beta_2$, estimate $\beta_1, \omega_{11}, \sigma_{12}$ by least squares estimation of the conditional model on the selected sample. One can clearly go on to produce the MLE estimators; Stata allows either option.
     - Note that the LM or Score test can be constructed directly by including $\lambda(x'_{2i}\hat\beta_2)$ in the selected regression and testing its coefficient. This is a one degree of freedom score test, so a t-test can be used.
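A sketch of the two-step Heckit on simulated data (my illustration; Stata's heckman command implements both the two-step and the MLE, as noted above). The data-generating values and variable names are assumptions. The second-stage standard errors below ignore the estimated $\hat\beta_2$, but the t-statistic on the $\lambda$ coefficient remains valid under $H_0: \sigma_{12} = 0$.

```python
# Two-step Heckit sketch: probit first stage, then OLS with the inverse
# Mills ratio on the selected sample. Illustrative DGP, not real data.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 20_000
x1 = rng.normal(size=n)                        # outcome-equation regressor
z = rng.normal(size=n)                         # excluded instrument
u1, u2 = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=n).T

D = (0.5 + x1 + z + u2 > 0).astype(int)        # selection indicator y2
y1 = 1.0 + 2.0 * x1 + u1                       # used only where D == 1

# Step 1: probit for selection, then lambda = phi(index) / Phi(index)
X2 = sm.add_constant(np.column_stack([x1, z]))
index = X2 @ sm.Probit(D, X2).fit(disp=0).params
lam = norm.pdf(index) / norm.cdf(index)

# Step 2: OLS on the selected sample with lambda as an extra regressor
sel = D == 1
X1 = sm.add_constant(np.column_stack([x1[sel], lam[sel]]))
heckit = sm.OLS(y1[sel], X1).fit()
print(heckit.params)       # last coefficient estimates sigma_12 (true 0.5)
print(heckit.tvalues[-1])  # t-test of H0: sigma_12 = 0 (no selectivity bias)
```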

  9. Advantages of the Normal Selection Model:
     (i) avoids the Tobit assumption;
     (ii) the 2-step Heckit estimator is straightforward;
     (iii) a t-test of the null hypothesis $H_0: \sigma_{12} = 0$, i.e. no selectivity bias, can be constructed easily.
     Disadvantages:
     (i) assumes joint normality;
     (ii) need to allow for the estimated $\hat\beta_2$ in $\lambda(x'_{2i}\hat\beta_2)$. It is typically easiest to compute the full MLE and use the usual formula for correct standard errors. Note that the t-test of selectivity bias can be carried out without this extra computation because the test statistic is valid under the null hypothesis $H_0$;
     (iii) need $\lambda(x'_{2i}\beta_2)$ to vary independently of $x'_{1i}\beta_1$.

  10. The requirement that $\lambda(x'_{2i}\beta_2)$ varies independently of $x'_{1i}\beta_1$ is strictly one of nonparametric identification since, in the parametric joint normal case for example, $\lambda$ is the nonlinear function $\phi(x'_{2i}\beta_2)/\Phi(x'_{2i}\beta_2)$ and is not perfectly collinear with $x'_{1i}\beta_1$ even if exactly the same variables are in $x_1$ and $x_2$.
     - However, even in the joint normal case, $\phi(x'_{2i}\beta_2)/\Phi(x'_{2i}\beta_2)$ can be approximately linear over large ranges of $x'_{2i}\beta_2$. In general, identification requires an exclusion restriction just as in the standard endogenous regressor case. This is really a triangular structure for a simultaneous model: $D_i$ is a single endogenous variable in the structural model for $y_1$. The order condition requires that at least one exogenous variable be excluded for each included right-hand-side endogenous variable.
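To see how thin this nonlinearity can be, a short check (my illustration, over an assumed range of the index):

```python
# Near-linearity of lambda(z) = phi(z)/Phi(z) over a typical index range:
# without an exclusion restriction, identification leans on this weak curvature.
import numpy as np
from scipy.stats import norm

z = np.linspace(-1.0, 2.0, 500)      # an assumed "large range" of the index
lam = norm.pdf(z) / norm.cdf(z)
print(np.corrcoef(z, lam)[0, 1])     # very close to -1 (about -0.98 here)
```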

  11. When we are unwilling to assume a parametric distribution for $u_1$ and $u_2$, the identification argument becomes even clearer.
     - As we noted above, given the linear index structure, the selection model can still be written
       $y_{1i} = x'_{1i}\beta_1 + g(x'_{2i}\beta_2) + \varepsilon_{1i}$
       for $y_{1i}$ observed, with $E(\varepsilon_1 \mid x_1, x_2) = 0$ and (maybe) $E(\varepsilon_1^2 \mid x_1, x_2) = \omega_{11}$.
     - But if we do not know the form of $g$, perfect collinearity can occur if there is no exclusion restriction. Indeed, in general we will need to exclude a continuous 'instrumental' variable.
     - Often this lines up well with the economic problem being addressed. For example, wages and employment: in this case the excluded instrument is non-labour income, which determines employment but not wages, at least in the static competitive model.

  12. Think of other cases: prices firms set across different markets, where the instrument may be local costs; occupational choice and earnings? Notice that the Tobit structure did not need such an exclusion restriction even when normality was relaxed. Does selection matter? Empirical examples include Blundell, Reed and Stoker (AER, 2003). Try the Mroz data? Does relaxing joint normality matter? Some evidence it does; see Newey, Powell and Walker (AER, 1990) and references therein. But relatively large sample sizes are needed for precision in semiparametric extensions.

  13. Semiparametric Methods:
     $y_{1i} = x'_{1i}\beta_1 + g(x'_{2i}\beta_2) + \varepsilon_{1i}$ for $y_{1i}$ observed.
     - Two-step methods (analogous to the Heckit estimator).
     - Quasi-maximum likelihood estimators (analogous to Klein-Spady).

  14. Semiparametric Methods: (i) Two-Step methods.
     1. Estimate $\beta_2$, say by maximum score.
     2. Estimate $\beta_1$, given $\hat\beta_2$.
     At the second stage there are also a number of possibilities. One attractive approach is simply to use a series approximation to $g(x'_{2i}\beta_2)$:
     $y_{1i} = x'_{1i}\beta_1 + \sum_{j=1}^J \eta_j\, \rho_j(x'_{2i}\hat\beta_2) + \epsilon_{1i}$
     where
     $\rho_j(x'_{2i}\hat\beta_2) = \lambda(x'_{2i}\hat\beta_2)\, (x'_{2i}\hat\beta_2)^{j-1}.$
     E.g. for $J = 3$, estimate on the selected sample only:
     $y_{1i} = x'_{1i}\beta_1 + \eta_1 \lambda(x'_{2i}\hat\beta_2) + \eta_2 \lambda(x'_{2i}\hat\beta_2)\, x'_{2i}\hat\beta_2 + \eta_3 \lambda(x'_{2i}\hat\beta_2)\, (x'_{2i}\hat\beta_2)^2 + \epsilon_{1i}.$
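A sketch of this second stage (my illustration): the slide's step 1 uses maximum score, but any consistent first-stage index estimate can stand in, e.g. the probit index from the Heckit sketch above. The helper name `series_second_step` and the reuse of that earlier index are assumptions for the example.

```python
# Series-approximation second step: OLS of y1 on x1 and the J terms
# rho_j = lambda(index) * index**(j-1), on the selected sample only.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

def series_second_step(y1_sel, x1_sel, index_sel, J=3):
    lam = norm.pdf(index_sel) / norm.cdf(index_sel)
    rho = np.column_stack([lam * index_sel ** (j - 1) for j in range(1, J + 1)])
    X = sm.add_constant(np.column_stack([x1_sel, rho]))
    return sm.OLS(y1_sel, X).fit()

# e.g. reusing y1, x1, index and sel from the Heckit sketch above:
# res = series_second_step(y1[sel], x1[sel], index[sel], J=3)
# print(res.params)   # beta_1 followed by the eta_j series coefficients
```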
