On comparison and improvement of estimators based on likelihood

Aleksander Zaigrajew
Nicolaus Copernicus University of Toruń

XVII Statystyka Matematyczna, Będlewo, November 30, 2016
Outline

1. A single parameter of interest
   - Problem and Estimators
   - SOR and Admissibility
   - Results
   - Examples
2. Multivariate case
   - Results
   - Examples
3. References
Maximum likelihood estimator

Let a sample $x = (x_1, x_2, \ldots, x_n)$ be drawn from an absolutely continuous distribution with an unknown parameter $\theta \in \Theta$. Let $\hat\theta_0(x)$ be the MLE of $\theta$. In general, $\hat\theta_0(x)$ is not an unbiased estimator of $\theta$. For regular models the bias of the MLE admits the representation (e.g. Cox and Snell, 1968):

$$E\,\hat\theta_0(x) - \theta = \frac{b(\theta)}{n} + O(n^{-2}), \quad n \to \infty.$$

The function $b(\cdot)$ is called the first-order bias of $\hat\theta_0(x)$. Following e.g. Anderson and Ray (1975), one can consider the bias-corrected MLE (BMLE):

$$\hat\theta_1(x) = \hat\theta_0(x) - \frac{b(\hat\theta_0(x))}{n}.$$
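For illustration, take the exponential density $p(u; \theta) = \theta e^{-\theta u}$: here $E\,\hat\theta_0 = n\theta/(n-1)$, so the first-order bias is $b(\theta) = \theta$. A minimal numerical sketch of the BMLE; the sample size, seed and true rate are arbitrary choices for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 50, 2.0                    # sample size and true rate (demo choices)
x = rng.exponential(scale=1.0 / theta, size=n)

theta0 = 1.0 / x.mean()               # MLE of the rate
b = lambda t: t                       # first-order bias b(theta) = theta in this model
theta1 = theta0 - b(theta0) / n       # bias-corrected MLE (BMLE)
print(theta0, theta1)
```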
This idea was realized numerically in estimating parameters of various distributions (e.g. Giles, 2012; Schwartz et al., 2013). The BMLE reduces the bias of the MLE: for a consistent estimator $\hat\theta_0(x)$ and a sufficiently smooth function $b(\cdot)$, the bias of the BMLE is of order $O(n^{-2})$.

Other standard approaches to reducing the bias of the MLE, though via tedious calculations, are the jackknife and bootstrap methods (e.g. Akahira, 1983). The jackknife estimator is given by

$$\hat\theta_J = n\hat\theta_0 - \frac{n-1}{n}\sum_{i=1}^{n}\hat\theta_{(i)},$$

where $\hat\theta_{(i)}$ is the MLE of $\theta$ based on $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$, $i = 1, \ldots, n$. The bias of this estimator is of order $O(n^{-2})$.

Let $\lambda$ be the nuisance parameter (either location or scale).
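A generic sketch of the jackknife correction; `mle` stands for any routine returning the MLE from a sample, instantiated below with the exponential rate purely for illustration:

```python
import numpy as np

def jackknife(x, mle):
    # theta_J = n * theta_0 - (n - 1)/n * sum of leave-one-out MLEs
    n = len(x)
    loo = np.array([mle(np.delete(x, i)) for i in range(n)])
    return n * mle(x) - (n - 1) / n * loo.sum()

rng = np.random.default_rng(1)
x = rng.exponential(scale=0.5, size=50)          # true rate 2.0
theta_J = jackknife(x, mle=lambda s: 1.0 / s.mean())
```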
Maximum integrated likelihood estimator

Let $p(\cdot\,; \theta, \lambda)$ be the pdf of the distribution considered. In what follows we assume that:

if $\lambda$ is location, then $p(u + c; \theta, \lambda) = p(u; \theta, \lambda - c)$ for all $c$;
if $\lambda$ is scale, then $p(cu; \theta, \lambda) = \frac{1}{c}\, p\!\left(u; \theta, \frac{\lambda}{c}\right)$ for all $c > 0$.

We also take into account the maximum integrated likelihood estimator (MILE), defined as

$$\hat\theta_2(x) \in \operatorname{Arg\,sup}_{\theta} \tilde L(\theta; x), \quad \text{where} \quad \tilde L(\theta; x) = \int L(\theta, \lambda; x)\, w(\lambda)\, d\lambda.$$

Here $L(\theta, \lambda; x)$ is the likelihood function corresponding to $x$, while $w(\lambda) = 1/\lambda$, $\lambda > 0$, if $\lambda$ is the scale parameter, and $w(\lambda) \equiv 1$ if $\lambda$ is the location parameter. The MILE has a reduced bias compared with the MLE, though it is still of order $O(n^{-1})$. Such an estimator was discussed e.g. in Berger et al. (1999) and Severini (2007).
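A numerical sketch of the MILE with a scale nuisance parameter, using a gamma model as in the examples below; the integration range, the bounds of the search interval and the log-scale shift are pragmatic choices, not part of the original construction:

```python
import numpy as np
from scipy import integrate, optimize, special

rng = np.random.default_rng(2)
x = rng.gamma(shape=3.0, scale=1.5, size=40)   # shape is the interest parameter
n = len(x)

def log_lik(alpha, sigma):
    # gamma log-likelihood ln L(alpha, sigma; x)
    return ((alpha - 1.0) * np.sum(np.log(x)) - np.sum(x) / sigma
            - n * special.gammaln(alpha) - n * alpha * np.log(sigma))

def log_int_lik(alpha):
    # log of tilde L(alpha) = int L(alpha, sigma; x) (1/sigma) d sigma
    shift = log_lik(alpha, x.mean() / alpha)   # value near the profile maximum
    integrand = lambda s: np.exp(log_lik(alpha, s) - shift) / s
    val, _ = integrate.quad(integrand, 1e-6, 50.0)
    return shift + np.log(val)

res = optimize.minimize_scalar(lambda a: -log_int_lik(a),
                               bounds=(0.1, 20.0), method="bounded")
alpha_mile = res.x
```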
Comparison of estimators

We compare the estimators $\hat\theta_i$, $i = 0, 1, 2$, w.r.t. the second order risk (SOR) based on the mean squared error (MSE) $E(\hat\theta_i - \theta)^2$. Consider the class of estimators

$$D = \{\hat\theta_0 + d(\hat\theta_0)/n + o_p(1/n) \mid d\colon \Theta \to \mathbb{R} \text{ is continuously differentiable}\}.$$

As is known (e.g. Rao, 1963), for regular models any first order efficient estimator of $\theta$ is inferior to a certain modified MLE $\hat\theta_0 + d(\hat\theta_0)/n$ with the same asymptotic bias structure, which means that any estimator in practical use can be chosen from the class $D$. The difference in the MSEs of any two estimators from $D$ is of order $O(n^{-2})$, since

$$MSE(\hat\theta) = MSE(\hat\theta_0) + \frac{d^2(\theta) + 2b(\theta)d(\theta) + 2d'(\theta)I^{11}(\theta)}{n^2} + O(n^{-3}),$$
where $I^{11}(\theta)$ is the $(1,1)$-element of the inverse of the Fisher information matrix per observation.

Given $\hat\theta^{(i)} \in D$, $i = 1, 2$, the estimator $\hat\theta^{(1)}$ is said to be second order better (SOB) than $\hat\theta^{(2)}$ (written $\hat\theta^{(1)} \succ_{SOR} \hat\theta^{(2)}$) provided

$$R(\hat\theta^{(1)}, \hat\theta^{(2)}; \theta) = \lim_{n\to\infty} n^2\left[MSE(\hat\theta^{(1)}) - MSE(\hat\theta^{(2)})\right] \le 0 \quad \forall\, \theta \in \Theta$$

with strict inequality holding for some $\theta$. If equality holds here for all $\theta \in \Theta$, we say that the two estimators are second order equivalent (SOE) (written $\hat\theta^{(1)} =_{SOR} \hat\theta^{(2)}$).

An estimator $\hat\theta \in D$ is called second order admissible (SOA) in $D$ if there does not exist any other estimator in $D$ which is SOB than $\hat\theta$. If $\hat\theta$ is not SOA, then it is second order inadmissible (SOI). If an estimator from $D$ is SOI, it can be improved to become SOA (Ghosh and Sinha, 1981; Tanaka et al., 2015).
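The quantity $R$ can be approximated by simulation. A Monte Carlo sketch for the MLE versus the BMLE in the exponential model, where $b(\theta) = \theta$; the parameter value, sample size and replication count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 100, 50_000

x = rng.exponential(scale=1.0 / theta, size=(reps, n))
theta0 = 1.0 / x.mean(axis=1)          # MLE replicates
theta1 = theta0 * (1.0 - 1.0 / n)      # BMLE, since b(theta) = theta here

mse0 = np.mean((theta0 - theta) ** 2)
mse1 = np.mean((theta1 - theta) ** 2)
print(n**2 * (mse0 - mse1))            # finite-n proxy for R(MLE, BMLE; theta)
```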
Ghosh and Sinha (1981)

Let $\theta \in \Theta$ be a single parameter, and let $I(\theta)$ and $b(\theta)$ be continuous functions.

Theorem (Ghosh and Sinha, 1981). Consider $\hat\theta \in D$ with the first-order bias $\tilde b(\theta) = b(\theta) + d(\theta)$. Then $\hat\theta$ is SOA in $D$ iff for some $\theta_0 \in \Theta = (\underline\theta, \overline\theta)$

$$\int_{\theta_0}^{\overline\theta} I(\theta)\pi(\theta)\, d\theta = \infty \quad (1) \qquad \text{and} \qquad \int_{\underline\theta}^{\theta_0} I(\theta)\pi(\theta)\, d\theta = \infty, \quad (2)$$

where $\pi(\theta) = \exp\left(-\int_{\theta_0}^{\theta} \tilde b(u)\, I(u)\, du\right)$.

If $\hat\theta$ is not SOA in $D$, then there exists an improvement $\hat\theta^* \in D$ which is SOB than $\hat\theta$.
Ghosh and Sinha (1981)

This improvement $\hat\theta^* = \hat\theta_0 + d^*(\hat\theta_0)/n$ is SOA in $D$ and

$$d^*(\theta) = d(\theta) - \frac{\pi(\theta)}{H(\theta)}, \quad H(\theta) = \int_{\theta}^{\overline\theta} I(u)\pi(u)\, du, \quad \text{if only (1) is violated;}$$

$$d^*(\theta) = d(\theta) + \frac{\pi(\theta)}{H(\theta)}, \quad H(\theta) = \int_{\underline\theta}^{\theta} I(u)\pi(u)\, du, \quad \text{if only (2) is violated;}$$

$$d^*(\theta) = d(\theta) + \pi(\theta)\left(\frac{1}{\underline H(\theta)} - \frac{1}{\overline H(\theta)}\right), \quad \text{if both (1) and (2) are violated,}$$

where $\underline H(\theta)$ and $\overline H(\theta)$ denote the lower and upper integrals above.
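As a worked illustration of the theorem (not taken from the slides), one can check conditions (1) and (2) for the MLE ($d \equiv 0$, so $\tilde b = b$) in the exponential model, where $b(\theta) = \theta$, $I(\theta) = 1/\theta^2$ and $\Theta = (0, \infty)$; the reference point $\theta_0 = 1$ is an arbitrary choice:

```python
import sympy as sp

theta, u = sp.symbols("theta u", positive=True)
I = 1 / theta**2          # Fisher information of the exponential rate
b_tilde = theta           # first-order bias of the MLE (d = 0)

pi = sp.exp(-sp.integrate(b_tilde.subs(theta, u) * I.subs(theta, u), (u, 1, theta)))
upper = sp.integrate(I * pi, (theta, 1, sp.oo))    # condition (1)
lower = sp.integrate(I * pi, (theta, 0, 1))        # condition (2)
print(sp.simplify(pi), upper, lower)  # pi = 1/theta; upper is finite -> (1) fails -> SOI
```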
As was shown in Cox and Snell (1968) (see also Zaigraev and Podraza-Karakulska, 2014), for regular models

$$b(\theta) = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{2} I^{1i} I^{jk}\left(G_{ij,k} + G_{ijk}/2\right),$$

where, e.g.,

$$G_{11,1} = E\left[\frac{\partial^2}{\partial\theta^2}\ln L(\theta,\lambda;x_1)\cdot\frac{\partial}{\partial\theta}\ln L(\theta,\lambda;x_1)\right], \quad G_{11,2} = E\left[\frac{\partial^2}{\partial\theta^2}\ln L(\theta,\lambda;x_1)\cdot\frac{\partial}{\partial\lambda}\ln L(\theta,\lambda;x_1)\right],$$

$$G_{111} = E\left[\frac{\partial^3}{\partial\theta^3}\ln L(\theta,\lambda;x_1)\right], \quad G_{112} = E\left[\frac{\partial^3}{\partial\theta^2\partial\lambda}\ln L(\theta,\lambda;x_1)\right], \quad \ldots$$

As to the MILE $\hat\theta_2$, for regular models (Zaigraev and Podraza-Karakulska, 2014):

$$\hat\theta_2(x) = \hat\theta_0(x) + \frac{a(\hat\theta_0(x))}{n} + O_p(n^{-2}), \quad n \to \infty \;\Longrightarrow\; \hat\theta_2 \in D,$$

$$a(\theta) = \frac{1}{2I_{22}}\sum_{k=1}^{2} I^{1k} G_{22k} - \frac{I^{12}}{\lambda}, \text{ if } \lambda \text{ is scale}; \qquad a(\theta) = 0, \text{ if } \lambda \text{ is location}.$$
Comparison of SORs

For regular models, comparing the SORs of two estimators $\hat\theta^{(1)} = \hat\theta_0 + d^{(1)}(\hat\theta_0)/n$ and $\hat\theta^{(2)} = \hat\theta_0 + d^{(2)}(\hat\theta_0)/n$ from $D$, we get

$$R(\hat\theta^{(1)}, \hat\theta^{(2)}; \theta) = \left(d^{(1)}(\theta)\right)^2 - \left(d^{(2)}(\theta)\right)^2 + 2b(\theta)\left(d^{(1)}(\theta) - d^{(2)}(\theta)\right) + 2I^{11}(\theta)\left((d^{(1)}(\theta))' - (d^{(2)}(\theta))'\right).$$

In particular,

$$R(\hat\theta_2, \hat\theta_0; \theta) = a^2(\theta) + 2a(\theta)b(\theta) + 2I^{11}(\theta)a'(\theta),$$
$$R(\hat\theta_0, \hat\theta_1; \theta) = b^2(\theta) + 2I^{11}(\theta)b'(\theta),$$
$$R(\hat\theta_2, \hat\theta_1; \theta) = (a(\theta) + b(\theta))^2 + 2I^{11}(\theta)\left(a'(\theta) + b'(\theta)\right).$$
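The particular cases follow from the general formula with $(d^{(1)}, d^{(2)})$ equal to $(a, 0)$, $(0, -b)$ and $(a, -b)$, since $\hat\theta_1$ has $d = -b$. A quick symbolic check; the function symbols are generic placeholders:

```python
import sympy as sp

theta = sp.symbols("theta")
a = sp.Function("a")(theta)
b = sp.Function("b")(theta)
I11 = sp.Function("I11")(theta)

def R(d1, d2):
    # general SOR difference from this slide
    return sp.expand(d1**2 - d2**2 + 2 * b * (d1 - d2)
                     + 2 * I11 * (sp.diff(d1, theta) - sp.diff(d2, theta)))

print(R(a, sp.S.Zero))    # a^2 + 2 a b + 2 I11 a'
print(R(sp.S.Zero, -b))   # b^2 + 2 I11 b'
print(R(a, -b))           # (a + b)^2 + 2 I11 (a' + b'), expanded
```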
Gamma distribution

Let a sample be drawn from the gamma $\Gamma(\alpha, \sigma)$ distribution; the shape parameter $\alpha > 0$ is of interest, the scale parameter $\sigma > 0$ is nuisance. Denote $g(\alpha) = \ln\alpha - \Psi(\alpha)$, where $\Psi(\alpha) = (\ln\Gamma(\alpha))'$, and $R(x) = \bar x/\tilde x$, where $\bar x$ is the arithmetic mean and $\tilde x$ is the geometric mean of $x_1, \ldots, x_n$. Then:

$\hat\alpha_0(x)$ is the unique root of the equation $g(\alpha) = \ln R(x)$; $\hat\sigma_0(x) = \bar x/\hat\alpha_0(x)$;

$\hat\alpha_2(x)$ is the unique root of the equation $g(\alpha) - g(n\alpha) = \ln R(x)$;

$$\hat\alpha_1(x) = \hat\alpha_0(x) - \frac{b(\hat\alpha_0(x))}{n}, \quad b(\alpha) = \frac{g''(\alpha)}{2(g'(\alpha))^2} - \frac{1}{2\alpha g'(\alpha)} > 0.$$
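A numerical sketch computing all three shape estimators; the bracketing interval for the root searches and the simulated sample are pragmatic assumptions:

```python
import numpy as np
from scipy import optimize, special

rng = np.random.default_rng(4)
x = rng.gamma(shape=3.0, scale=1.5, size=40)
n = len(x)
lnR = np.log(x.mean()) - np.mean(np.log(x))   # ln(arithmetic mean / geometric mean)

g = lambda a: np.log(a) - special.digamma(a)
gp = lambda a: 1.0 / a - special.polygamma(1, a)        # g'
gpp = lambda a: -1.0 / a**2 - special.polygamma(2, a)   # g''

alpha0 = optimize.brentq(lambda a: g(a) - lnR, 1e-3, 1e3)              # MLE
sigma0 = x.mean() / alpha0
alpha2 = optimize.brentq(lambda a: g(a) - g(n * a) - lnR, 1e-3, 1e3)   # MILE
b = gpp(alpha0) / (2 * gp(alpha0) ** 2) - 1.0 / (2 * alpha0 * gp(alpha0))
alpha1 = alpha0 - b / n                                                # BMLE
```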