Econometrics 1: IV, GMM and MLE

James A. Duffy¹

Oxford, Michaelmas 2016 (revised: 28/12/16)

¹ I thank N. Geesing, L. Freund, K. Kuske, and E. Munro for comments. The manuscript was prepared with LyX 2.2.2.
Contents

1 Instrumental variables
  1.1 Introduction
  1.2 Identification
    1.2.1 Rank and order conditions
    1.2.2 A restatement of the rank condition
  1.3 Estimation
    1.3.1 From identification to estimation
    1.3.2 Asymptotics
  1.4 The ‘exclusion restriction’
  1.5 Another way of computing the 2SLS estimator
  1.6 Testing exogeneity of the endogenous regressors (Hausman test)
  1.7 Testing the identifying conditions
    1.7.1 Tests of overidentifying restrictions (Sargan test)
    1.7.2 Testing the rank condition
  1.8 Weak instruments
    1.8.1 The problem
    1.8.2 Dealing with weak instruments
    1.8.3 The Anderson–Rubin (AR) test
  1.A Suggested (optional) further reading
2 Generalised method of moments
  2.1 Introduction
    2.1.1 Motivating examples
    2.1.2 A general framework
  2.2 Asymptotics
    2.2.1 Consistency
    2.2.2 Asymptotic normality
    2.2.3 Local identification and weak identification
  2.3 Asymptotic efficiency
    2.3.1 The choice of weight matrix
    2.3.2 The implied (efficient) choice of moments
    2.3.3 Efficiency in the linear IV model
  2.4 Tests of over-identifying restrictions
  2.5 Hypothesis testing
    2.5.1 Tests of nonlinear restrictions and the delta method
    2.5.2 GMM criterion-based tests (QLR tests)
  2.A Suggested (optional) further reading
3 Maximum likelihood
  3.1 Introduction
    3.1.1 Parametric and semiparametric estimation
    3.1.2 The likelihood function: the general case
    3.1.3 The likelihood function: with i.i.d. data
  3.2 Univariate examples
    3.2.1 Continuous random variables
    3.2.2 Discrete random variables
    3.2.3 Mixed continuous/discrete random variables
  3.3 Models with covariates
  3.4 Consistency and identification
    3.4.1 Consistency
    3.4.2 Identification via Kullback–Leibler minimisation
  3.5 Asymptotic distribution of the MLE
    3.5.1 Asymptotic normality
    3.5.2 Efficiency properties
  3.6 Hypothesis testing
  3.A Suggested (optional) further reading
4 References
A Mathematical appendix
  A.1 Notation
  A.2 Matrices
  A.3 Asymptotics
    A.3.1 Modes of stochastic convergence
    A.3.2 Key results
  A.4 Suggested (optional) further reading
ECONOMETRICS 1, MT 2016 20/04/17 J. A. DUFFY

1 Instrumental variables

• Throughout these notes, all variables are i.i.d. unless otherwise stated.

1.1 Introduction

• We would like to estimate the parameters $\beta_0 \in \mathbb{R}^{d_x}$ in

  $$y_i = x_i^T \beta_0 + u_i \tag{1.1}$$

  (here, as throughout, a ‘0’ subscript denotes the true value of a parameter).
  – For various reasons (omitted variables, measurement error, unobservable heterogeneity and the like), the usual identifying orthogonality condition $E x_i u_i = 0$ may not be considered plausible.
  – Those elements of $x_i$ for which this condition fails are said to be endogenous.
• Instead, identification will be achieved by means of the instruments $z_i$, which are assumed to fulfil this same orthogonality condition,

  IV-ORTH   $E z_i u_i = 0$.

  (Note that $x_i$ and $z_i$ may share some common elements, but they cannot coincide entirely: IV-ORTH would then reduce to the condition $E x_i u_i = 0$ that was just deemed implausible.)

1.2 Identification

1.2.1 Rank and order conditions

• A parameter is identified if it is uniquely determined from the joint distribution of the data, $w_i = (y_i, x_i, z_i)$. (At least in i.i.d. settings, in which the joint distribution can always be consistently estimated, this is equivalent to asking whether the parameter can be consistently estimated.)
• Identification of $\beta_0$ will follow if the equation

  $$0 = E (y_i - x_i^T \beta) z_i = E z_i y_i - E z_i x_i^T \beta \tag{1.2}$$

  has a unique solution at $\beta = \beta_0$. The r.h.s. depends only on $\beta$ and the distribution of $w_i$, through the moments $E z_i y_i$ and $E z_i x_i^T$.
• To show that $\beta_0$ indeed solves (1.2), rewrite as

  $$0 \overset{(1)}{=} E (x_i^T \beta_0 + u_i - x_i^T \beta) z_i \overset{(2)}{=} E z_i x_i^T (\beta_0 - \beta), \tag{1.3}$$

  where $\overset{(1)}{=}$ follows by (1.1), and $\overset{(2)}{=}$ by IV-ORTH.
• Is $\beta = \beta_0$ the only solution to (1.3)? Since $E z_i x_i^T$ is a $d_z \times d_x$ matrix, the equation $[E z_i x_i^T] \delta = 0$ admits a solution at some $\delta \neq 0$ if and only if $\operatorname{rk} E z_i x_i^T < d_x$ (see Appendix A.2). In this case, there will be other $\beta$’s, distinct from $\beta_0$, for which (1.3) holds.
• A necessary and sufficient condition for identification is thus

  IV-RANK   $\operatorname{rk} E z_i x_i^T = d_x$,

  termed the rank condition (or, somewhat more informally, the relevance condition).
  – A necessary, but not sufficient, condition for the rank condition is that $d_z \geq d_x$, termed the order condition. In other words, there must be at least as many instruments as there are regressors.
  – The model is said to be exactly identified when $d_z = d_x$ (i.e. when we have just enough instruments to identify $\beta_0$) and overidentified when $d_z > d_x$; in the latter case, we can test for some violations of IV-ORTH.
• In the overidentified case, we have strictly more instruments than are needed to identify $\beta_0$; and in consequence, the number of such instruments may be reduced, down to $d_x$, without prejudicing identification.
• More formally, for some $d_x \times d_z$ matrix $L$, consider the $d_x$ new ‘instruments’ $z_{L,i} := L z_i$ formed by taking $d_x$ linear combinations of the original instruments.
  – $z_{L,i}$ clearly satisfy the required orthogonality condition, since by IV-ORTH

    $$0 = L E z_i u_i = E z_{L,i} u_i.$$

  – Similarly, premultiplying (1.2) by $L$ yields the identifying condition

    $$0 = E z_{L,i} y_i - E z_{L,i} x_i^T \beta \iff 0 = E z_{L,i} x_i^T (\beta_0 - \beta),$$

    where the equivalence follows via exactly the same reasoning as that which led from (1.2) to (1.3) above.
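
The rank condition and the reduction to $d_x$ instruments via $L$ can be illustrated numerically. The following is only a minimal sketch under assumed values: the simulated design (two instruments, one endogenous regressor, $\beta_0 = 2$), the matrix `M_bad` and the helper `solve_reduced` are hypothetical and do not come from these notes. It solves the sample analogue of the premultiplied moment equation $L\, E z_i y_i = L\, E z_i x_i^T \beta$ for two choices of $L$, and compares the result with least squares, which relies on the implausible condition $E x_i u_i = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0 = 200_000, 2.0

# Hypothetical design (an assumption for illustration): d_z = 2, d_x = 1.
Z = rng.standard_normal((n, 2))            # instruments, independent of u
v = rng.standard_normal(n)
u = 0.8 * v + rng.standard_normal(n)       # error term, correlated with x via v
x = Z @ np.array([1.0, 0.5]) + v           # endogenous regressor: E[x u] = 0.8
X = x[:, None]                             # n x d_x matrix of regressors
y = X @ np.array([beta0]) + u              # model (1.1)

# Sample analogues of the moments E[z y] and E[z x^T] appearing in (1.2).
Szy = Z.T @ y / n                          # shape (d_z,)
Szx = Z.T @ X / n                          # shape (d_z, d_x)

def solve_reduced(L):
    """Solve L Szy = L Szx beta for a (d_x x d_z) matrix L (hypothetical helper)."""
    return np.linalg.solve(L @ Szx, L @ Szy)

beta_a = solve_reduced(np.array([[1.0, 0.0]]))   # keep only the first instrument
beta_b = solve_reduced(np.array([[0.5, 0.5]]))   # average of the two instruments

# Least squares solves the sample analogue of E[x (y - x^T beta)] = 0 instead.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Rank condition on *population* moment matrices (known here by construction).
M_ok = np.array([[1.0], [0.5]])            # E[z x^T] for the design above
M_bad = np.array([[1.0, 2.0],
                  [0.5, 1.0]])             # hypothetical d_x = 2 case, collinear columns
delta = np.array([2.0, -1.0])              # nonzero vector with M_bad @ delta = 0

rank_ok = np.linalg.matrix_rank(M_ok)      # equals d_x = 1: IV-RANK holds
rank_bad = np.linalg.matrix_rank(M_bad)    # less than d_x = 2: IV-RANK fails

print("beta via L = [1, 0]:    ", beta_a)
print("beta via L = [0.5, 0.5]:", beta_b)
print("least squares:          ", beta_ols)
print("ranks:", rank_ok, rank_bad, "| M_bad @ delta =", M_bad @ delta)
```

In this design both choices of $L$ should give values close to $\beta_0$, whereas least squares should not. Note that the ranks are computed from population matrices that are known by construction; applying `matrix_rank` to sample moments would not constitute a formal test of IV-RANK.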