MECT Microeconometrics
Blundell Lecture 2: Censored Data Models
Richard Blundell
http://www.ucl.ac.uk/~uctp39a/
University College London
February-March 2016
Blundell (University College London) ECONG107: Blundell Lecture 2 February-March 2016 1 / 29
Censored Data Models

• Censored and truncated data. Examples:
  - earnings
  - hours of work (mroz.dta is a 'typical' data set to play with)
  - top coding of wealth
  - expenditure on cars (James Tobin's original example, which became known as Tobin's Probit model, or the Tobit model)
• Typical definitions:
  - Censored data includes the censoring points.
  - Truncated data excludes the censoring points.
• A mixture of discrete and continuous processes. In general we should model the process of censoring or truncation as a separate discrete mechanism, i.e. the 'selectivity' model.
• To begin with, we have a model in which the two processes are generated from the same underlying continuous latent variable model, e.g. corner solution models in economics:
$$ y_i^* = x_i'\beta + u_i $$
with
$$ y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} $$
or
$$ y_i = \begin{cases} y_i^* & \text{if } u_i > -x_i'\beta \\ 0 & \text{otherwise} \end{cases} $$
• Sometimes we also define $D_i$:
$$ D_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} $$
The general specification for the censored regression model is
$$ y_i^* = x_i'\beta + u_i $$
$$ y_i = \max\{0, y_i^*\} $$
where $y^*$ is the unobservable underlying process (similar to what was used with discrete choice models) and $y$ is the data observation.
• When $u$ is normally distributed, $u \mid x \sim N(0, \sigma^2)$, the model is the Tobit model.
• Note that
$$ P(y > 0 \mid x) = P(u > -x'\beta \mid x) = \Phi\left(\frac{x'\beta}{\sigma}\right) $$
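The probability of a positive outcome above can be checked by simulation. The following is a minimal sketch, assuming a single scalar regressor and illustrative parameter values; the helper names `norm_cdf` and `simulate_tobit` are hypothetical and not part of the lecture.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_tobit(x, beta, sigma, n, rng):
    """Draw n censored outcomes y = max(0, x*beta + u) with u ~ N(0, sigma^2)."""
    return [max(0.0, x * beta + rng.gauss(0.0, sigma)) for _ in range(n)]


rng = random.Random(0)
x, beta, sigma = 1.0, 0.5, 2.0  # illustrative values (assumptions)

y = simulate_tobit(x, beta, sigma, 200_000, rng)

# Empirical share of uncensored observations versus Phi(x*beta/sigma).
share_positive = sum(v > 0 for v in y) / len(y)
prob_positive = norm_cdf(x * beta / sigma)
print(round(share_positive, 3), round(prob_positive, 3))
```

The two printed numbers should agree up to simulation noise, since the censoring indicator is a probit in $x'\beta/\sigma$.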
• Consider the moments of the truncated normal.
• Assume $w \sim N(0, \sigma^2)$. Then $w \mid w > c$, where $c$ is an arbitrary constant, is a truncated normal.
• The density function for the truncated normal is:
$$ f(w \mid w > c) = \frac{f(w)}{1 - F(c)} = \frac{\frac{1}{\sigma}\phi\left(\frac{w}{\sigma}\right)}{1 - \Phi\left(\frac{c}{\sigma}\right)} $$
where $f$ is the density function of $w$ and $F$ is the cumulative distribution function of $w$.
• We can now write
$$ E(w \mid w > c) = \int_c^\infty w f(w \mid w > c)\,dw = \sigma\,\frac{\phi\left(\frac{c}{\sigma}\right)}{1 - \Phi\left(\frac{c}{\sigma}\right)} $$
Applying this result to the regression model with $c = -x'\beta$, and using $\phi(-a) = \phi(a)$ and $1 - \Phi(-a) = \Phi(a)$, yields
$$ E(y \mid x, y > 0) = x'\beta + E(u \mid u > -x'\beta) = x'\beta + \sigma\,\frac{\phi\left(\frac{x'\beta}{\sigma}\right)}{\Phi\left(\frac{x'\beta}{\sigma}\right)} $$
• Note that $\phi(w)/\Phi(w)$ is the Inverse Mills Ratio, usually written $\lambda(w)$.
• Also note that, contrary to the discrete choice models, the variance of the residual plays a central role here: it determines the size of the partial effects.
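The truncated-mean formula can be verified numerically. The sketch below, with hypothetical function names and illustrative values for $x'\beta$ and $\sigma$, compares the closed form with a midpoint-rule approximation of the integral and then forms $E(y \mid x, y > 0)$ using the inverse Mills ratio.

```python
import math


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(a):
    """lambda(a) = phi(a) / Phi(a)."""
    return norm_pdf(a) / norm_cdf(a)


def truncated_mean_closed_form(c, sigma):
    """E(w | w > c) for w ~ N(0, sigma^2)."""
    a = c / sigma
    return sigma * norm_pdf(a) / (1.0 - norm_cdf(a))


def truncated_mean_numeric(c, sigma, upper_sd=10.0, steps=200_000):
    """Midpoint-rule integral of w * f(w | w > c) over [c, upper_sd * sigma]."""
    hi = upper_sd * sigma
    h = (hi - c) / steps
    total = 0.0
    for k in range(steps):
        w = c + (k + 0.5) * h
        total += w * norm_pdf(w / sigma) / sigma
    return total * h / (1.0 - norm_cdf(c / sigma))


xb, sigma = 0.5, 2.0  # illustrative values (assumptions)

closed = truncated_mean_closed_form(-xb, sigma)    # E(u | u > -xb)
numeric = truncated_mean_numeric(-xb, sigma)
cond_mean = xb + sigma * inverse_mills(xb / sigma)  # E(y | x, y > 0)
print(closed, numeric, cond_mean)
```

The upper integration limit of ten standard deviations is a truncation choice made for the sketch; the omitted tail mass is negligible at that distance.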
OLS Bias

• Truncated Data:
• Suppose one uses only the positive observations to estimate the model and the unobservables are normally distributed. Then, we have seen that
$$ E(y \mid x, y > 0) = x'\beta + \sigma\lambda\left(\frac{x'\beta}{\sigma}\right) $$
where the last term is $E(u \mid x, u > -x'\beta)$, which is generally non-zero.
• A model of the form
$$ y = x'\beta + \sigma\lambda + v $$
would have $E(v \mid x, y > 0) = 0$.
• OLS of $y$ on $x$ alone omits the term $\sigma\lambda(x'\beta/\sigma)$. Since this term is a function of $x$, the resulting error term is correlated with $x$ and OLS is inconsistent: an omitted variable problem.
Censored Data:

• Now suppose we use all observations, both positive and zero.
• Under normality of the residual, we obtain
$$ E(y \mid x) = \Phi\left(\frac{x'\beta}{\sigma}\right) x'\beta + \sigma\phi\left(\frac{x'\beta}{\sigma}\right) $$
• Thus, once again the OLS estimates will be biased and inconsistent.
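The OLS bias in the truncated and censored cases can be illustrated with a small simulation. This is a sketch under assumed values ($\beta = 1$, $\sigma = 1$, a standard normal scalar regressor, no intercept in the latent model); `ols_slope` is a hypothetical helper.

```python
import random


def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and a scalar x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(42)
beta, sigma, n = 1.0, 1.0, 50_000  # illustrative values (assumptions)

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_star = [beta * xi + rng.gauss(0.0, sigma) for xi in x]  # latent outcome
y = [max(0.0, v) for v in y_star]                          # censored at zero

# OLS on the (infeasible) latent outcome, on all censored data,
# and on the truncated sample of positive observations only.
slope_latent = ols_slope(x, y_star)
slope_censored = ols_slope(x, y)
positive = [(xi, yi) for xi, yi in zip(x, y) if yi > 0]
slope_truncated = ols_slope([p[0] for p in positive], [p[1] for p in positive])

print(slope_latent, slope_censored, slope_truncated)
```

In this design only the regression on the latent outcome recovers a slope close to $\beta$; both feasible OLS regressions are attenuated towards zero.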
The Maximum Likelihood Estimator

• Let $\{(y_i, x_i),\ i = 1, \dots, N\}$ be a random sample of data on the model. The contribution to the likelihood of a zero observation is determined by
$$ P(y_i = 0 \mid x_i) = 1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right) $$
The contribution to the likelihood of a non-zero observation is determined by
$$ f(y_i \mid x_i) = \frac{1}{\sigma}\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) $$
which is not invariant to $\sigma$. Thus, the overall contribution of observation $i$ to the log-likelihood function is
$$ \ln l_i(x_i; \beta, \sigma) = 1(y_i = 0)\ln\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] + 1(y_i > 0)\ln\left[\frac{1}{\sigma}\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right)\right] $$
and the sample log-likelihood is
$$ \ln L_N(\beta, \sigma) = \sum_{i=1}^N \left\{ (1 - D_i)\ln\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right] + D_i\left[\ln\phi\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln\sigma\right] \right\} $$
where $D_i$ equals one when $y_i^* > 0$ and equals zero otherwise.
• Notice that both $\beta$ and $\sigma$ are separately identified. Moreover, if $D_i = 1$ for all $i$, the ML and the OLS estimators will be the same.
• FOC:
$$ \frac{\partial \ln L}{\partial \beta} = \frac{1}{\sigma^2}\sum_{i=1}^N \left[ D_i (y_i - x_i'\beta)x_i - (1 - D_i)\frac{\sigma\phi\left(\frac{x_i'\beta}{\sigma}\right)}{1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)}x_i \right] $$
$$ \frac{\partial \ln L}{\partial \sigma^2} = \sum_{i=1}^N \left\{ (1 - D_i)\frac{x_i'\beta\,\phi\left(\frac{x_i'\beta}{\sigma}\right)}{2\sigma^3\left[1 - \Phi\left(\frac{x_i'\beta}{\sigma}\right)\right]} + D_i\left[\frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{1}{2\sigma^2}\right] \right\} $$
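The sample log-likelihood and its first-order conditions can be written out directly. The sketch below is one possible implementation, with hypothetical function names and with each $x_i$ stored as a list of floats; it parameterises the variance as $\sigma^2$ and computes $1 - \Phi$ with `math.erfc` for numerical accuracy.

```python
import math


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_sf(z):
    """Upper tail 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tobit_loglik(beta, sigma2, xs, ys):
    """Sample log-likelihood of the Tobit model censored at zero."""
    sigma = math.sqrt(sigma2)
    total = 0.0
    for x, y in zip(xs, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        if y > 0:
            z = (y - xb) / sigma
            # ln phi(z) - ln sigma
            total += -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z - math.log(sigma)
        else:
            total += math.log(norm_sf(xb / sigma))
    return total


def tobit_score(beta, sigma2, xs, ys):
    """Analytic gradient with respect to beta and sigma^2."""
    sigma = math.sqrt(sigma2)
    g_beta = [0.0] * len(beta)
    g_sigma2 = 0.0
    for x, y in zip(xs, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        if y > 0:
            r = y - xb
            for j, v in enumerate(x):
                g_beta[j] += r * v / sigma2
            g_sigma2 += r * r / (2.0 * sigma2 ** 2) - 1.0 / (2.0 * sigma2)
        else:
            a = xb / sigma
            hazard = norm_pdf(a) / norm_sf(a)  # phi / (1 - Phi)
            for j, v in enumerate(x):
                g_beta[j] -= hazard * v / sigma
            g_sigma2 += xb * hazard / (2.0 * sigma ** 3)
    return g_beta, g_sigma2
```

A useful sanity check for code of this kind is to compare `tobit_score` with central finite differences of `tobit_loglik` at an arbitrary parameter value.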
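Or, writing $\phi_i \equiv \phi(x_i'\beta/\sigma)$ and $\Phi_i \equiv \Phi(x_i'\beta/\sigma)$, with $i \in 0$ indexing the zero observations, $i \in +$ the positive observations and $N_+$ the number of positive observations:
$$ (1) \quad \frac{\partial \ln L}{\partial \beta} = -\frac{1}{\sigma^2}\sum_{i \in 0} \frac{\sigma\phi_i}{1 - \Phi_i}x_i + \frac{1}{\sigma^2}\sum_{i \in +}(y_i - x_i'\beta)x_i $$
$$ (2) \quad \frac{\partial \ln L}{\partial \sigma^2} = \frac{1}{2\sigma^3}\sum_{i \in 0}\frac{x_i'\beta\,\phi_i}{1 - \Phi_i} - \frac{N_+}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i \in +}(y_i - x_i'\beta)^2 $$
• Note that, with (1) and (2) set to zero,
$$ \frac{\beta'}{2\sigma^2} \times (1) + (2) \;\rightarrow\; \hat{\sigma}^2 = \frac{1}{N_+}\sum_{i \in +}(y_i - x_i'\hat{\beta})\,y_i $$
that is, only the positive observations contribute to the estimation of $\sigma$.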
• Also, if we define $m_i \equiv E(y_i^* \mid y_i)$, then we can write (1) as
$$ \frac{\partial \ln L}{\partial \beta} = c\sum_{i=1}^N x_i(m_i - x_i'\beta) $$
with $c = 1/\sigma^2$, so that setting it to zero gives
$$ \sum_{i=1}^N x_i m_i = \sum_{i=1}^N x_i x_i'\beta $$
which defines an EM algorithm for the Tobit model. Note also that
$$ m_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ x_i'\beta - \sigma\dfrac{\phi_i}{1 - \Phi_i} & \text{otherwise} \end{cases} $$
with $\phi_i \equiv \phi(x_i'\beta/\sigma)$ and $\Phi_i \equiv \Phi(x_i'\beta/\sigma)$, again replacing $y^*$ with its best guess, given $y$, when it is unobserved.
• Using Theorems 1 and 2 from Lecture 6, the MLE of $\beta$ and $\sigma^2$ is consistent and asymptotically normally distributed.
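The EM iteration can be sketched as follows for a model with an intercept and one scalar regressor. The $\beta$ update is the regression of $m_i$ on $x_i$ described above. The $\sigma^2$ update is not given in the slides; the version used here is the standard EM update for a censored normal regression, which adds the conditional variance of $y_i^*$ for the censored observations, and should be read as an assumption of this sketch. All function names, starting values and simulation settings are hypothetical.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_sf(z):
    """Upper tail 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def ols(x, m):
    """Intercept and slope from regressing m on a constant and a scalar x."""
    n = len(x)
    xbar, mbar = sum(x) / n, sum(m) / n
    sxx = sum((v - xbar) ** 2 for v in x)
    sxm = sum((v - xbar) * (w - mbar) for v, w in zip(x, m))
    b1 = sxm / sxx
    return mbar - b1 * xbar, b1


def tobit_em(x, y, iters=1000):
    """EM iterations for y = max(0, b0 + b1*x + u), u ~ N(0, s2)."""
    b0, b1, s2 = 0.0, 0.0, 1.0  # crude starting values
    n = len(x)
    for _ in range(iters):
        sigma = math.sqrt(s2)
        m, extra = [], 0.0
        for xi, yi in zip(x, y):
            if yi > 0:
                m.append(yi)  # y* is observed
            else:
                xb = b0 + b1 * xi
                a = xb / sigma
                h = norm_pdf(a) / norm_sf(a)           # phi / (1 - Phi)
                m.append(xb - sigma * h)               # E(y* | y* <= 0)
                extra += s2 * (1.0 + a * h - h * h)    # Var(y* | y* <= 0)
        b0, b1 = ols(x, m)  # M-step for beta: regress m on x
        rss = sum((mi - b0 - b1 * xi) ** 2 for xi, mi in zip(x, m))
        s2 = (rss + extra) / n  # M-step for sigma^2 (assumed update)
    return b0, b1, s2


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [max(0.0, 0.5 + 1.0 * xi + rng.gauss(0.0, 1.0)) for xi in x]

b0, b1, s2 = tobit_em(x, y)
print(b0, b1, s2)
```

At a fixed point of these iterations both first-order conditions hold, so the converged estimates are a stationary point of the Tobit log-likelihood. A fixed number of iterations is used for simplicity; a convergence criterion on the parameter change would be the more usual choice.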
• Exercise: Derive the asymptotic covariance matrix from the expected values of the second partial derivatives of $\ln L$.
• Note it has the general form
$$ -\begin{pmatrix} E\dfrac{\partial^2 \ln L}{\partial\beta\,\partial\beta'} & E\dfrac{\partial^2 \ln L}{\partial\beta\,\partial\sigma^2} \\ \cdot & E\dfrac{\partial^2 \ln L}{\partial(\sigma^2)^2} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^N a_i x_i x_i' & \sum_{i=1}^N b_i x_i \\ \cdot & \sum_{i=1}^N c_i \end{pmatrix} $$
LM or Score Test

• Let the log-likelihood be written $\ln L(\theta_1, \theta_2)$, where $\theta_1$ is the set of parameters that are unrestricted under the null hypothesis and $\theta_2$ are the $k_2$ restricted parameters under $H_0$.
$$ H_0: \theta_2 = 0 \qquad H_1: \theta_2 \neq 0 $$
• e.g.
$$ y_i^* = x_{1i}'\beta_1 + x_{2i}'\beta_2 + u_i \quad \text{with } u_i \sim N(0, \sigma^2) $$
where $\theta_1 = (\beta_1', \sigma^2)'$ and $\theta_2 = \beta_2$.
$$ \frac{\partial \ln L(\theta_1, \theta_2)}{\partial\theta} = \sum_i \frac{\partial \ln l_i(\theta_1, \theta_2)}{\partial\theta} $$
or
$$ S(\theta) = \sum_i S_i(\theta) $$
• Let $\tilde{\theta}$ be the MLE under $H_0$. Then
$$ \frac{1}{\sqrt{N}} S(\tilde{\theta}) \overset{a}{\sim} N(0, H) $$
therefore
$$ \frac{1}{N} S(\tilde{\theta})' H^{-1} S(\tilde{\theta}) \overset{a}{\sim} \chi^2_{(k_2)} $$
In the Tobit model consider the case of $H_0: \beta_2 = 0$:
$$ \frac{\partial \ln L}{\partial\beta_2} = \frac{1}{\sigma^2}\sum_i D_i(y_i - x_i'\beta)x_{2i} - \frac{1}{\sigma^2}\sum_i (1 - D_i)\frac{\sigma\phi_i}{1 - \Phi_i}x_{2i} $$
$$ \frac{\partial \ln L}{\partial\beta_2} = \frac{1}{\sigma^2}\sum_i e_i^{(1)} x_{2i} $$
where
$$ e_i^{(1)} = D_i(y_i - x_i'\beta) + (1 - D_i)\left(-\frac{\sigma\phi_i}{1 - \Phi_i}\right) $$
is known as the first-order 'generalised' residual, which reduces to $u_i = y_i - x_i'\beta$ in the general linear model case.
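The generalised residual and the score for $\beta_2$ are straightforward to compute once the restricted estimates are available. The sketch below assumes a scalar $x_{2i}$ and takes the fitted index $x_i'\beta$ and $\sigma$ as given inputs; the function names are hypothetical. It computes only the score, not the full LM statistic, which also needs an estimate of the score variance.

```python
import math


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_sf(z):
    """Upper tail 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def generalised_residual(y, xb, sigma):
    """First-order generalised residual e_i^(1) for a Tobit censored at zero."""
    if y > 0:
        return y - xb  # ordinary residual for uncensored observations
    a = xb / sigma
    return -sigma * norm_pdf(a) / norm_sf(a)  # E(u | u <= -xb) for censored ones


def score_beta2(ys, xbs, x2s, sigma):
    """Score for beta_2: (1 / sigma^2) * sum_i e_i^(1) * x_2i, with scalar x_2i."""
    total = sum(generalised_residual(y, xb, sigma) * x2
                for y, xb, x2 in zip(ys, xbs, x2s))
    return total / sigma ** 2
```

When every observation is uncensored the function returns ordinary residuals, so `score_beta2` reduces to the usual linear-model moment between residuals and the excluded regressor.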
This kind of Score or LM test can be extended to specification tests for heteroskedasticity and for non-normality. Notice that estimation under the alternative is avoided, at least in terms of the test statistic. If $H_0$ is rejected then estimation under $H_a$ is unavoidable.
• Consider the normal distribution
$$ f(u_i) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{1}{2}\frac{u_i^2}{\sigma^2}\right) $$
which can be written in terms of log scores
$$ \frac{\partial \ln f(u_i)}{\partial u_i} = -\frac{u_i}{\sigma^2}. $$
• A popular generalisation (Pearson family of distributions) is