properties of extremum estimators

Properties of Extremum Estimators: Asymptotic Theory, Part III

Properties of Extremum Estimators. Asymptotic Theory, Part III. James J. Heckman, University of Chicago, Econ 312. This draft: April 12, 2006.


  1. Properties of Extremum Estimators Asymptotic Theory — Part III James J. Heckman University of Chicago Econ 312 This draft, April 12, 2006

  2. As we saw in an earlier lecture (Asymptotic Theory, Part II), the Maximum Likelihood Estimator, the Nonlinear Least Squares (NLS) estimator, and even the OLS estimator are all examples of "extremum estimators". In this lecture we examine theorems and proofs for the consistency and asymptotic normality of extremum estimators in a somewhat specialized, but easily generalized, form.

  3. The following theorems lay out the conditions under which extremum estimators are consistent and asymptotically normal. Each is stated for estimators defined by the maximum principle, but extends trivially to minimum-principle estimators by placing a negative sign in front of $Q_T(Y, \theta)$, i.e. $\arg\min_\theta Q_T = \arg\max_\theta [-Q_T]$. (Note: in this lecture, $Y$ denotes all the data, and hence includes both the dependent and the independent variables, which the earlier lecture denoted separately.)
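
The equivalence between the minimum and maximum principles can be checked numerically. The sketch below is only an illustration: the data-generating process (a no-intercept regression with true slope 1.5), the grid of candidate slopes, and all variable names are assumptions made for this example and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.5 * x + rng.normal(size=200)    # illustrative data, assumed true slope 1.5

grid = np.linspace(-5.0, 5.0, 2001)   # candidate slopes (hypothetical grid)

# Sum of squared residuals for every candidate slope: a minimum-principle criterion.
ssr = ((y[None, :] - grid[:, None] * x[None, :]) ** 2).sum(axis=1)

theta_min = grid[np.argmin(ssr)]      # minimizer of the SSR
theta_max = grid[np.argmax(-ssr)]     # maximizer of the negated SSR

print(theta_min, theta_max)
```

Both searches pick the same grid point, which is why the theorems only need to be stated for maximizers.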

  4. 1 Consistency of extremum estimators
     The first theorem proves consistency when the criterion function has a globally unique maximum (or minimum, respectively) in the population, so that $\theta_0$ is uniquely identified. Differentiability of $Q_T(\theta)$ is not required.
     The second theorem states the additional assumptions needed if $\theta_0$ is only locally identified, i.e. the problem $\max_\theta Q(\theta)$ has multiple solutions but only one lies in the neighborhood $N(\theta_0)$ of $\theta_0$. It assumes differentiability of $Q_T(\theta)$.

  5. Theorem 1 (Global): Assume that
     1. The parameter space $\Theta$ is a compact subset of $\mathbb{R}^K$;
     2. $Q_T(Y, \theta)$ is continuous in $\theta \in \Theta$ for all $Y$, and is a measurable function of $Y$ for all $\theta \in \Theta$;
     3. $Q_T(Y, \theta) \to Q(\theta)$, a nonstochastic function, in probability uniformly in $\theta \in \Theta$ as $T \to \infty$; and
     4. $\theta_0 = \arg\max_{\theta \in \Theta} Q(\theta)$ is globally identified (i.e. $Q(\theta)$ achieves its global maximum at $\theta_0$).
     If we let $\hat\theta_T = \arg\max_{\theta \in \Theta} Q_T(Y, \theta)$, then $\hat\theta_T \xrightarrow{p} \theta_0$.

  6. Observe that continuity of $Q(\theta)$ follows from the fact that uniform limits of continuous functions are continuous, and that continuity of $Q$ in $\theta$ together with compactness of $\Theta$ implies uniform continuity of $Q(\theta)$.
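
Assumption (3), uniform convergence of the sample criterion to its population limit, can be illustrated in a case where the limit is known in closed form. The sketch below assumes a normal location model with unit variance and the criterion $Q_T(\theta) = -\frac{1}{T}\sum_t (y_t - \theta)^2$, whose limit is $Q(\theta) = -(1 + (\theta - \theta_0)^2)$. The model, the compact set $[-3, 3]$, and the helper name `sup_gap` are assumptions made for this example only.

```python
import numpy as np

def sup_gap(T, theta0=1.0, seed=0):
    """Largest |Q_T(theta) - Q(theta)| over a grid on the compact set [-3, 3]."""
    rng = np.random.default_rng(seed)
    y = rng.normal(loc=theta0, scale=1.0, size=T)
    grid = np.linspace(-3.0, 3.0, 601)
    # Sample criterion -(1/T) * sum (y_t - theta)^2, expanded so it is vectorized in theta.
    q_T = -(np.mean(y**2) - 2.0 * grid * np.mean(y) + grid**2)
    # Population criterion -E(y - theta)^2 = -(1 + (theta - theta0)^2) for unit variance.
    q = -(1.0 + (grid - theta0) ** 2)
    return np.max(np.abs(q_T - q))

gaps = {T: sup_gap(T) for T in (100, 10_000, 1_000_000)}
for T, gap in gaps.items():
    print(T, gap)
```

The printed gap is a supremum over the whole grid rather than a pointwise difference, which is the sense of convergence the theorem requires.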

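  7. Proof. Let $N(\theta_0)$ be an open neighborhood in $\mathbb{R}^K$ containing $\theta_0$. Then $N^c(\theta_0)$, the complement of $N(\theta_0)$, is closed, so $N^c(\theta_0) \cap \Theta$ is compact. Hence $\max_{\theta \in N^c(\theta_0) \cap \Theta} Q(\theta)$ exists. Denote
     $$\varepsilon = Q(\theta_0) - \max_{\theta \in N^c(\theta_0) \cap \Theta} Q(\theta) > 0.$$
     Let $A_T$ be the event
     $$A_T = \{ |Q_T(\theta) - Q(\theta)| < \varepsilon/2 \ \text{for all } \theta \in \Theta \} = \{ -\varepsilon/2 < Q_T(\theta) - Q(\theta) < \varepsilon/2 \ \text{for all } \theta \in \Theta \}.$$
     This event is "likely" for large $T$ by assumption (3) (uniform convergence of $Q_T$ to $Q$), i.e.:
     $$Q_T \to Q \ \text{in probability, uniformly} \implies \Pr\{A_T\} \to 1 \ \text{as } T \to \infty. \qquad (*)$$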

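  8. Then $A_T$ implies:
     1. $Q(\hat\theta_T) > Q_T(\hat\theta_T) - \varepsilon/2$;
     2. $Q_T(\theta_0) > Q(\theta_0) - \varepsilon/2$.
     Also we have $Q_T(\hat\theta_T) \ge Q_T(\theta_0)$ by the definition of $\hat\theta_T$. Then from the above facts we get:
     $$Q(\hat\theta_T) > Q_T(\hat\theta_T) - \varepsilon/2 \ge Q_T(\theta_0) - \varepsilon/2 > Q(\theta_0) - \varepsilon \implies Q(\hat\theta_T) > Q(\theta_0) - \varepsilon.$$
     By the definition of $\varepsilon$, the right-hand side $Q(\theta_0) - \varepsilon$ equals the maximum of $Q$ outside $N(\theta_0)$. Since the inequality is strict, $\hat\theta_T$ cannot lie outside $N(\theta_0)$, so we get that:
     $$A_T \implies \{ \hat\theta_T \in N(\theta_0) \} \ \text{for } T \text{ sufficiently large.}$$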

  9. Then it must be that:
     $$\Pr\{A_T\} \le \Pr\{ \hat\theta_T \in N(\theta_0) \}.$$
     Then, from equation (*) we have that:
     $$\lim_{T \to \infty} \Pr\{A_T\} = 1 \implies \lim_{T \to \infty} \Pr\{ \hat\theta_T \in N(\theta_0) \} = 1,$$
     and so $\hat\theta_T \xrightarrow{p} \theta_0$, because the choice of $N(\theta_0)$ is arbitrary.
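
The conclusion of Theorem 1 can be seen in a small simulation. The sketch below is a hedged illustration, not part of the lecture: it assumes a nonlinear regression $y_t = \exp(\theta_0 x_t) + u_t$ with $\theta_0 = 0.5$, takes the compact parameter space to be a grid on $[-2, 2]$, and maximizes the negated mean squared residual by brute force. The function name `extremum_estimate` and all numerical settings are hypothetical.

```python
import numpy as np

def extremum_estimate(T, theta0=0.5, seed=0):
    """Grid maximizer of Q_T(theta) = -(1/T) * sum (y_t - exp(theta * x_t))^2."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=T)
    y = np.exp(theta0 * x) + rng.normal(scale=0.5, size=T)
    grid = np.linspace(-2.0, 2.0, 401)   # compact parameter space (assumed)
    q_T = np.array([-np.mean((y - np.exp(th * x)) ** 2) for th in grid])
    return grid[np.argmax(q_T)]

# Absolute estimation error at increasing sample sizes.
errors = {T: abs(extremum_estimate(T) - 0.5) for T in (50, 500, 50_000)}
for T, err in errors.items():
    print(T, err)
```

No derivative of the criterion is used anywhere, in line with the remark that differentiability is not needed for the global result.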

  10. Theorem 2 (Local): Assume that:
      1. The parameter space $\Theta$ is an open subset of $\mathbb{R}^K$ that contains $\theta_0$;
      2. $Q_T(Y, \theta)$ is a measurable function of $Y$ for all $\theta \in \Theta$;
      3. $\partial Q_T / \partial\theta$ exists and is continuous in an open neighborhood $N_1(\theta_0)$ of $\theta_0$ (this implies $Q_T$ is continuous for all $\theta \in N_1(\theta_0)$);
      4. There exists an open neighborhood $N_2(\theta_0)$ of $\theta_0$ such that $Q_T(Y, \theta) \to Q(\theta)$, a nonstochastic function, in probability uniformly in $\theta \in N_2(\theta_0)$ as $T \to \infty$; and
      5. $\theta_0 = \arg\max_{\theta \in N_2(\theta_0)} Q(\theta)$ is locally identified.

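  11. If we let $\hat\Theta_T$ denote the set of roots of $\partial Q_T / \partial\theta = 0$ corresponding to the local maxima, then, for any $\varepsilon > 0$,
      $$\lim_{T \to \infty} \Pr\left\{ \inf_{\theta \in \hat\Theta_T} |\theta - \theta_0| > \varepsilon \right\} = 0.$$
      Proof. See Amemiya, chapter 4.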

  12. 2 Asymptotic normality of extremum estimators
      We now show that, under certain conditions on the first and second derivatives of the criterion function $Q_T$, the asymptotic distribution of the extremum estimator $\hat\theta_T$ (chosen as the maximizer of $Q_T$) is normal.

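  13. Theorem 3 (Cramer): Assume the conditions of Theorem 2 and, in addition:
      1. $\partial^2 Q_T / \partial\theta\,\partial\theta'$ exists and is continuous in an open neighborhood of $\theta_0$;
      2. There exists an open neighborhood $N(\theta_0)$ of $\theta_0$ such that $Q_T(Y, \theta) \to Q(\theta)$, a nonstochastic function, in probability uniformly in $\theta \in N(\theta_0)$ as $T \to \infty$;
      3. $\dfrac{\partial^2 Q_T(\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta_T^*} \xrightarrow{p} A(\theta_0)$ if $\theta_T^* \xrightarrow{p} \theta_0$, where $A(\theta_0) = \operatorname{plim} \left( \dfrac{\partial^2 Q_T(\theta)}{\partial\theta\,\partial\theta'} \right)\Big|_{\theta_0}$ is nonsingular; and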

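  14. 4. $\sqrt{T}\,\dfrac{\partial Q_T(\theta)}{\partial\theta}\Big|_{\theta_0} \xrightarrow{d} N(0, B(\theta_0))$, where
         $$B(\theta_0) = \lim_{T \to \infty} E\left[ T \cdot \frac{\partial Q_T(\theta)}{\partial\theta}\Big|_{\theta_0} \cdot \frac{\partial Q_T(\theta)}{\partial\theta'}\Big|_{\theta_0} \right].$$
      If we let $\hat\theta_T$ denote the root of $\partial Q_T / \partial\theta = 0$, then:
      $$\sqrt{T}\,(\hat\theta_T - \theta_0) \xrightarrow{d} N\left(0,\ A(\theta_0)^{-1} B(\theta_0) A(\theta_0)^{-1}\right).$$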
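
The limiting variance in Theorem 3 has a "sandwich" form, $A^{-1} B A^{-1}$, and each piece has a natural sample analogue. The sketch below is an illustration under assumptions that are not in the lecture: it takes the criterion to be $Q_T(b) = -\frac{1}{2T}\sum_t (y_t - x_t' b)^2$ (so the root of the first-order condition is OLS), simulates heteroskedastic errors, and plugs in sample averages for $A$ and $B$. All variable names and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5_000
X = np.column_stack([np.ones(T), rng.normal(size=T)])
u = rng.normal(size=T) * (0.5 + np.abs(X[:, 1]))   # heteroskedastic errors (assumed DGP)
y = X @ np.array([1.0, 2.0]) + u

# Root of the first-order condition of Q_T(b) = -(1/(2T)) * sum (y_t - x_t'b)^2.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

scores = X * resid[:, None]        # per-observation gradient x_t * (y_t - x_t'b)
A_hat = -(X.T @ X) / T             # sample Hessian of Q_T
B_hat = (scores.T @ scores) / T    # sample second moment of the scores

A_inv = np.linalg.inv(A_hat)
avar = A_inv @ B_hat @ A_inv       # sample analogue of A^{-1} B A^{-1}
std_err = np.sqrt(np.diag(avar) / T)

print(beta_hat, std_err)
```

Dividing the sandwich by $T$ before taking square roots converts the variance of $\sqrt{T}(\hat\theta_T - \theta_0)$ into standard errors for $\hat\theta_T$ itself.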

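  15. Proof. By assumption we have:
      $$\frac{\partial Q_T}{\partial\theta}\Big|_{\hat\theta_T} = 0.$$
      Taking a Taylor expansion of the l.h.s. around $\theta_0$, we have
      $$\frac{\partial Q_T}{\partial\theta}\Big|_{\hat\theta_T} = \frac{\partial Q_T}{\partial\theta}\Big|_{\theta_0} + \frac{\partial^2 Q_T}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} (\hat\theta_T - \theta_0) + o_p(1),$$
      where $\theta^*$ lies between $\hat\theta_T$ and $\theta_0$. (With $\theta^*$ chosen row by row, this mean-value expansion is exact, so the remainder stays negligible after scaling.) Multiplying by $\sqrt{T}$, we get:
      $$0 = \sqrt{T}\,\frac{\partial Q_T}{\partial\theta}\Big|_{\theta_0} + \frac{\partial^2 Q_T}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} \sqrt{T}\,(\hat\theta_T - \theta_0) + o_p(1).$$
      Rearranging, we get:
      $$\sqrt{T}\,(\hat\theta_T - \theta_0) = -\left( \frac{\partial^2 Q_T}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} \right)^{-1} \sqrt{T}\,\frac{\partial Q_T}{\partial\theta}\Big|_{\theta_0} + o_p(1).$$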
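
The rearranged expansion says that the scaled estimation error is approximately minus the inverse Hessian times the scaled score. A quick numerical check is sketched below under assumptions of my own choosing: an exponential model with density $\theta e^{-\theta y}$, the average log likelihood $Q_T(\theta) = \log\theta - \theta \bar{y}$ as criterion, and the Hessian evaluated at $\theta_0$ rather than at the intermediate point $\theta^*$ (which is what the limit argument eventually justifies). Names and settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, T = 2.0, 100_000
y = rng.exponential(scale=1.0 / theta0, size=T)   # density theta0 * exp(-theta0 * y)

# Criterion: Q_T(theta) = log(theta) - theta * mean(y).
theta_hat = 1.0 / y.mean()            # root of dQ_T/dtheta = 1/theta - mean(y)
score0 = 1.0 / theta0 - y.mean()      # first derivative of Q_T at theta0
hess0 = -1.0 / theta0**2              # second derivative of Q_T at theta0

exact = np.sqrt(T) * (theta_hat - theta0)
linear = -np.sqrt(T) * score0 / hess0  # -(Hessian)^{-1} * sqrt(T) * score

print(exact, linear, abs(exact - linear))
```

The two numbers differ only by a second-order term, which is the quantity the proof treats as asymptotically negligible.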

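  16. Since $\theta^*$ lies between $\hat\theta_T$ and $\theta_0$, $\hat\theta_T \xrightarrow{p} \theta_0$ implies $\theta^* \xrightarrow{p} \theta_0$, and we see that the first object on the r.h.s. becomes:
      $$\operatorname{plim} \frac{\partial^2 Q_T}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} = \operatorname{plim} \frac{\partial^2 Q_T}{\partial\theta\,\partial\theta'}\Big|_{\theta_0} = A(\theta_0),$$
      where $A(\theta_0)$ is constant. As for the second object on the r.h.s., by assumption,
      $$\sqrt{T}\,\frac{\partial Q_T(\theta)}{\partial\theta}\Big|_{\theta_0} \xrightarrow{d} N(0, B(\theta_0)).$$
      Putting this all together we have, by Slutsky's Theorem,
      $$\sqrt{T}\,(\hat\theta_T - \theta_0) \xrightarrow{d} N\left(0,\ A(\theta_0)^{-1} B(\theta_0) A(\theta_0)^{-1}\right).$$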
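
The final result can be checked by Monte Carlo in a model where $A$ and $B$ are known exactly. The sketch below assumes i.i.d. Poisson data with mean $\lambda_0 = 3$ and the average log likelihood (up to a constant) $Q_T(\lambda) = \bar{y}\log\lambda - \lambda$, so that $A = -1/\lambda_0$, $B = 1/\lambda_0$ and the sandwich equals $\lambda_0$. The model choice, sample size, and number of replications are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
lam0, T, reps = 3.0, 500, 4_000
y = rng.poisson(lam=lam0, size=(reps, T))

lam_hat = y.mean(axis=1)               # maximizer of Q_T in each replication
z = np.sqrt(T) * (lam_hat - lam0)      # scaled estimation errors

A = -1.0 / lam0                        # limit of the second derivative at lam0
B = 1.0 / lam0                         # variance of the per-observation score at lam0
sandwich = B / A**2                    # scalar version of A^{-1} B A^{-1}
mc_var = z.var()                       # Monte Carlo variance of the scaled errors

print(sandwich, mc_var)
```

Because this is a correctly specified likelihood, $B = -A$ and the sandwich collapses to $-A^{-1}$; the general formula does not rely on that equality.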

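  17. Observe that assumption (4) is a consequence of a uniform central limit theorem. When the criterion is a sample average, $Q_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} q_t(\theta)$, where $q_t$ denotes the contribution of observation $t$, we have
      $$\sqrt{T}\,\frac{\partial Q_T(\theta)}{\partial\theta}\Big|_{\theta_0} = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial q_t(\theta)}{\partial\theta}\Big|_{\theta_0},$$
      a sum of i.i.d. random variables with mean zero, which we norm by $\sqrt{T}$. We get, by a CLT, that the variance of this random variable is
      $$E\left[ \left( \frac{\partial q_t(\theta)}{\partial\theta} \right) \left( \frac{\partial q_t(\theta)}{\partial\theta} \right)' \right]\Big|_{\theta_0}.$$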
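
The claim about the normalized score can also be simulated. The sketch below assumes a scalar logit model, chosen only for illustration, in which the per-observation score at $\theta_0$ is $(y_t - \Lambda(\theta_0 x_t))\,x_t$ with $\Lambda$ the logistic function. It compares the Monte Carlo variance of the normalized sums with the sample second moment of the individual scores. All names and settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, T, reps = 1.0, 200, 4_000
x = rng.normal(size=(reps, T))
p = 1.0 / (1.0 + np.exp(-theta0 * x))                 # logistic success probability
y = (rng.uniform(size=(reps, T)) < p).astype(float)   # Bernoulli outcomes

s = (y - p) * x                        # per-observation score at theta0 (mean zero)
normed = s.sum(axis=1) / np.sqrt(T)    # (1/sqrt(T)) * sum_t s_t, one per replication

clt_var = normed.var()                 # variance of the normalized score across replications
second_moment = np.mean(s**2)          # sample analogue of E[s_t^2]

print(normed.mean(), clt_var, second_moment)
```

The normalized sums are centered near zero and their variance is close to the second moment of a single score, as the CLT argument predicts.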

  18. References
      [1] Amemiya, Advanced Econometrics, 1985, chapter 4.
      [2] Newey and McFadden, "Large Sample Estimation and Hypothesis Testing," in Handbook of Econometrics, Volume IV, 1994, chapter 36.
