
NUMERICAL COMPARISON AMONG DIFFERENT EMPIRICAL PREDICTION INTERVALS



  1. NUMERICAL COMPARISON AMONG DIFFERENT EMPIRICAL PREDICTION INTERVALS. Masayo Yoshimori, Research Fellow of JSPS, Graduate School of Engineering Science, Osaka University. (The research was conducted under the supervision of Professor Partha Lahiri at the University of Maryland, College Park.) Small Area Estimation (2013) at Bangkok, September 4th, 2013.

  2. Outline: (1) Empirical Bayes estimator under the Fay-Herriot model; (2) Confidence Interval; (3) Simulation study; (4) Conclusion.

  3. Empirical Bayes estimator under the Fay-Herriot model: The Fay-Herriot Bayesian Model. Ref: Fay and Herriot (JASA, 1979). For $i = 1, \dots, m$:
     Level 1 (Sampling Distribution): $y_i \mid \theta_i \sim N(\theta_i, D_i)$;
     Level 2 (Prior Distribution): $\theta_i \sim N(x_i'\beta, A)$,
     where $m$: number of small areas; $y_i$: direct survey estimate of $\theta_i$; $\theta_i$: true mean for area $i$; $x_i$: $p \times 1$ vector of known auxiliary variables; $D_i$: known sampling variance of the direct estimate. The $p \times 1$ vector of regression coefficients $\beta$ and the model variance $A$ are unknown.

  4. Empirical Bayes estimator under the Fay-Herriot model: Bayes Estimator of $\theta_i$. The purpose is to predict the true mean $\theta_i$ for area $i$. When the model variance $A$ is known, the following Bayes estimator of $\theta_i$ is obtained by minimizing $MSE(\hat\theta_i)$ among all linear unbiased predictors of $\theta_i$, where $MSE(\hat\theta_i) = E[(\hat\theta_i - \theta_i)^2]$ and $E$ is the expectation with respect to the Fay-Herriot model:
     $$\hat\theta_i^{B} = (1 - B_i)\, y_i + B_i\, x_i'\hat\beta,$$
     where
     $$B_i \equiv B_i(A) = \frac{D_i}{A + D_i}, \qquad \hat\beta \equiv \hat\beta(A) = (X'V^{-1}X)^{-1} X'V^{-1} y,$$
     and $V \equiv V(A) = \mathrm{diag}(A + D_1, \dots, A + D_m)$.
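The estimator on this slide can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; the function name `bayes_estimator` and its argument layout are hypothetical choices made for this sketch and do not come from the slides.

```python
import numpy as np

def bayes_estimator(y, X, D, A):
    """Bayes estimator of theta_i under the Fay-Herriot model with known A.

    y: (m,) direct estimates, X: (m, p) auxiliary variables,
    D: (m,) known sampling variances, A: model variance.
    """
    y, X, D = np.asarray(y, float), np.asarray(X, float), np.asarray(D, float)
    v_inv = 1.0 / (A + D)                 # diagonal of V(A)^{-1}
    XtVinv = X.T * v_inv                  # X' V^{-1}, shape (p, m)
    beta = np.linalg.solve(XtVinv @ X, XtVinv @ y)   # GLS estimator beta_hat(A)
    B = D / (A + D)                       # shrinkage factors B_i(A)
    theta = (1.0 - B) * y + B * (X @ beta)
    return theta, beta, B
```

With an intercept-only design and equal sampling variances, `beta` reduces to the sample mean of `y`, and each area is pulled toward that mean by the same factor $B_i$.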

  5. Empirical Bayes estimator under the Fay-Herriot model: Empirical Bayes (EB) Estimator of $\theta_i$. Let $\hat A$ be a consistent estimator of the model variance $A$ for large $m$. An EB estimator of $\theta_i$ is given by
     $$\hat\theta_i^{EB} = (1 - \hat B_i)\, y_i + \hat B_i\, x_i'\hat\beta,$$
     where
     $$\hat B_i = \frac{D_i}{\hat A + D_i}, \qquad \hat\beta = \hat\beta(\hat A).$$
     Ref: Efron and Morris (JASA, 1975), Fay and Herriot (JASA, 1979).

  6. Confidence Interval: Confidence Interval for $\theta_i$. An interval, denoted by $I_i$, is called a $100(1 - \alpha)\%$ interval for $\theta_i$ if
     $$P(\theta_i \in I_i \mid \beta, A) = 1 - \alpha, \quad \forall\, \beta \in R^p,\ A \in R^+,$$
     where the probability $P$ is with respect to the joint distribution of $\{(y_i, \theta_i),\ i = 1, \dots, m\}$ under the Fay-Herriot model; $R^+$ is the positive part of the real line.

  7. Confidence Interval: A General Form of Confidence Interval for $\theta_i$. Most of the intervals proposed in the literature can be written as
     $$\left(\hat\theta_i + q_1(\alpha)\,\hat\tau_i(\hat\theta_i),\ \ \hat\theta_i + q_2(\alpha)\,\hat\tau_i(\hat\theta_i)\right),$$
     where $\hat\theta_i$ is an estimator of $\theta_i$; $\hat\tau_i(\hat\theta_i)$ is an estimate of the measure of uncertainty of $\hat\theta_i$; $q_1(\alpha)$ and $q_2(\alpha)$ are chosen suitably in an effort to attain coverage probability close to the nominal level $1 - \alpha$.

  8. Confidence Interval: Direct Confidence Interval. The choice $\hat\theta_i = y_i$ leads to the direct interval $I_i^D$ given by
     $$I_i^D : y_i \pm z_{\alpha/2}\sqrt{D_i},$$
     where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of $N(0, 1)$.
     Remarks: The coverage probability is $1 - \alpha$. When $D_i$ is large, the interval is too long to support any reasonable conclusion.

  9. Confidence Interval: Synthetic Confidence Interval. Ref: Hall and Maiti (JRSS, 2006).
     $$\left(x_i'\hat\beta + q_1(\alpha)\sqrt{\hat A},\ \ x_i'\hat\beta + q_2(\alpha)\sqrt{\hat A}\right),$$
     where $\hat A$ is a consistent estimator of $A$, for example the residual maximum likelihood (REML) estimator. The cut-off points satisfy $L_i^*[q_2(\alpha)] - L_i^*[q_1(\alpha)] = 1 - \alpha$, where $L_i^*$ is a parametric bootstrap approximation of the distribution $L_i$ of $(\theta_i - x_i'\hat\beta)/\sqrt{\hat A}$.
     Remarks: The method is synthetic (Rao 2005). This approach could be especially useful when $y_i$ is missing for the $i$th area.

  10. Confidence Interval: Bayesian Credible Interval. Assume $\beta$ and $A$ are known.
     $$I_i^B(A) : \hat\theta_i^B(A) \pm z_{\alpha/2}\,\sigma_i(A),$$
     where
     $$\hat\theta_i^B \equiv \hat\theta_i^B(A) = (1 - B_i)\, y_i + B_i\, x_i'\beta, \qquad B_i \equiv B_i(A) = \frac{D_i}{D_i + A}, \qquad \sigma_i(A) = \sqrt{\frac{A D_i}{A + D_i}}.$$
     Remarks: $\theta_i \mid y_i; \beta, A \sim N[\hat\theta_i^B(A),\ g_{1i} = \sigma_i^2(A)]$. The Bayesian credible interval cuts down the length of the direct confidence interval by $100 \times (1 - \sqrt{1 - B_i})\%$. The maximum benefit from the Bayesian methodology is achieved when $B_i$ is near 1.
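The length reduction quoted in the remarks follows from $\sigma_i(A)/\sqrt{D_i} = \sqrt{A/(A + D_i)} = \sqrt{1 - B_i}$. A small sketch using only the Python standard library compares the direct interval $y_i \pm z_{\alpha/2}\sqrt{D_i}$ with the credible interval; the function names are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def direct_interval(y_i, D_i, alpha=0.05):
    """Direct interval y_i +/- z_{alpha/2} * sqrt(D_i)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(D_i)
    return y_i - half, y_i + half

def credible_interval(y_i, xb_i, D_i, A, alpha=0.05):
    """Bayesian credible interval with beta and A known; xb_i stands for x_i' beta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    B = D_i / (D_i + A)
    centre = (1 - B) * y_i + B * xb_i
    half = z * sqrt(A * D_i / (A + D_i))
    return centre - half, centre + half
```

For example, with $D_i = 3$ and $A = 1$ we get $B_i = 0.75$, so the credible interval is half as long as the direct interval, a 50% reduction.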

  11. Confidence Interval: Empirical Bayes Confidence Interval. Ref: Cox (1975).
     $$I_i^{Cox}(\hat A) : \hat\theta_i^{EB}(\hat A) \pm z_{\alpha/2}\,\sigma(\hat A),$$
     where $x_i^T\beta = \mu$ is estimated by the sample mean $\bar y = m^{-1}\sum_{i=1}^m y_i$ and $A$ by the ANOVA estimator
     $$\hat A_{ANOVA} = \max\left\{(m - 1)^{-1}\sum_{i=1}^m (y_i - \bar y)^2 - D,\ 0\right\}.$$
     Remarks: The length of the Cox interval is smaller than that of the direct interval. The distribution of $(\theta_i - \hat\theta_i^{EB})/\sigma(\hat A)$ is not standard Normal, so it is not appropriate to use the Normal quantile $z_{\alpha/2}$ as the cut-off point. The Cox empirical Bayes confidence interval introduces a coverage error of the order $O(m^{-1})$, which is not accurate enough in most small area applications. The length of the interval is zero when $\hat A_{ANOVA} = 0$.
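A minimal sketch of the Cox interval for the common-mean, equal-variance setting on this slide, using only the standard library. The function name `cox_interval` and its return layout are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def cox_interval(y, D, i, alpha=0.05):
    """Cox-type EB interval for area i with common mean and common sampling variance D.

    Returns ((lower, upper), A_hat), where A_hat is the ANOVA estimate of A.
    """
    m = len(y)
    ybar = sum(y) / m
    # ANOVA estimator, truncated at zero
    A_hat = max(sum((v - ybar) ** 2 for v in y) / (m - 1) - D, 0.0)
    B_hat = D / (A_hat + D)
    centre = (1 - B_hat) * y[i] + B_hat * ybar       # EB point estimate
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(A_hat * D / (A_hat + D))         # z * sigma(A_hat)
    return (centre - half, centre + half), A_hat
```

Calling it on data whose sample variance is below `D` gives `A_hat = 0` and an interval of zero length, which is the degenerate case noted in the last remark.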

  12. Confidence Interval: Other EB Confidence Intervals.
     (1) Replace $\sigma(\hat A)$ by a measure of uncertainty that captures uncertainty due to estimation of the hyperparameters $\beta$ and $A$ (e.g., $\sqrt{g_{1i} + g_{2i} + 2 g_{3i}}$) (Ref: Morris (JASA, 1983); Prasad and Rao (JASA, 1990)).
     (2) Replace $z_{\alpha/2}$ by $z_{\alpha/2}\, c_i(\hat A)$ to reduce the coverage error to $O(m^{-1.5})$ (Datta et al., Scand. Stat. 2002; Basu et al. 2003; Sasase and Kubokawa, JRSS., 2005; Yoshimori, Comm. Stat., 2013).
     (3) Parametric bootstrap (Laird and Louis, JASA 1987; Carlin and Louis 1996; Chatterjee et al., AS 2008).

  13. Confidence Interval: Parametric Bootstrap Confidence Interval. Ref: Chatterjee, Lahiri and Li (AS, 2008). Use the distribution of $(\theta_i^* - \hat\theta_i^{EB*})/\sigma_i(\hat A^*)$ to approximate the distribution of $(\theta_i - \hat\theta_i^{EB})/\sigma_i(\hat A)$.
     Compute $\hat\beta$ and $\hat A$;
     Draw a bootstrap sample from the following bootstrap model: (i) $y_i^* \mid \theta_i^* \overset{ind}{\sim} N(\theta_i^*, D_i)$; (ii) $\theta_i^* \overset{ind}{\sim} N(x_i'\hat\beta, \hat A)$;
     Compute $\hat\beta^*$ and $\hat A^*$ from $y^*$. Then we have
     $$\hat\theta_i^{EB*} = (1 - \hat B_i^*)\, y_i^* + \hat B_i^*\, x_i'\hat\beta^* \quad \text{and} \quad \sigma_i^2(\hat A^*) = \frac{\hat A^* D_i}{\hat A^* + D_i};$$
     Compute $(\theta_i^* - \hat\theta_i^{EB*})/\sigma_i(\hat A^*)$.
     Remarks: When the REML estimate is zero, it needs to be truncated at some small positive value.

  14. Confidence Interval: Parametric Bootstrap Confidence Interval.
     $$CI_i^{PB} = \left[\hat\theta_i^{EB} + q_1(\alpha)\,\sigma_i(\hat A),\ \ \hat\theta_i^{EB} + q_2(\alpha)\,\sigma_i(\hat A)\right],$$
     where $L_i^*[q_2(\alpha)] - L_i^*[q_1(\alpha)] = 1 - \alpha$, and $L_i^*$ is a parametric bootstrap approximation of the distribution of $(\theta_i - \hat\theta_i^{EB})/\sigma_i(\hat A)$.
     Theorem: Under regularity conditions,
     $$\Pr(\theta_i \in CI_i^{PB}) = 1 - \alpha + O(m^{-1.5}).$$
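The bootstrap procedure can be sketched in a few lines of standard-library Python. Several parts of this sketch are assumptions rather than the presented method: it uses an intercept-only model ($x_i'\beta = \mu$), it replaces REML with a simple moment estimator (`fit_common_mean`, a hypothetical stand-in) floored at 0.01, and it picks equal-tailed quantiles for $q_1(\alpha)$ and $q_2(\alpha)$, which is only one way to satisfy $L_i^*[q_2(\alpha)] - L_i^*[q_1(\alpha)] = 1 - \alpha$.

```python
import random
from math import sqrt

def fit_common_mean(y, D, floor=0.01):
    """Illustrative stand-in for REML: moment estimate of A and GLS estimate of mu."""
    m = len(y)
    ybar = sum(y) / m
    s2 = sum((v - ybar) ** 2 for v in y) / (m - 1)
    A_hat = max(s2 - sum(D) / m, floor)          # truncate at a small positive value
    w = [1.0 / (A_hat + d) for d in D]
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return mu_hat, A_hat

def eb_and_sigma(y_i, D_i, mu_hat, A_hat):
    """EB point estimate and sigma_i(A_hat) for one area."""
    B = D_i / (A_hat + D_i)
    return (1 - B) * y_i + B * mu_hat, sqrt(A_hat * D_i / (A_hat + D_i))

def pb_interval(y, D, i, alpha=0.05, n_boot=2000, seed=0):
    """Parametric bootstrap interval for theta_i based on the studentized pivot."""
    rng = random.Random(seed)
    mu_hat, A_hat = fit_common_mean(y, D)
    pivots = []
    for _ in range(n_boot):
        theta_star = [rng.gauss(mu_hat, sqrt(A_hat)) for _ in D]          # step (ii)
        y_star = [rng.gauss(t, sqrt(d)) for t, d in zip(theta_star, D)]   # step (i)
        mu_s, A_s = fit_common_mean(y_star, D)                            # re-estimate
        eb_s, sig_s = eb_and_sigma(y_star[i], D[i], mu_s, A_s)
        pivots.append((theta_star[i] - eb_s) / sig_s)
    pivots.sort()
    q1 = pivots[int(n_boot * alpha / 2)]             # lower empirical quantile
    q2 = pivots[int(n_boot * (1 - alpha / 2)) - 1]   # upper empirical quantile
    eb, sig = eb_and_sigma(y[i], D[i], mu_hat, A_hat)
    return eb + q1 * sig, eb + q2 * sig
```

Fixing `seed` makes the interval reproducible; swapping `fit_common_mean` for a REML fit would bring the sketch closer to the setting discussed in the slides.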

  15. Confidence Interval: A Research Question. Which of the confidence intervals should one use when REML is used to estimate $A$? Restricted Maximum Likelihood estimator (REML estimator):
     $$\hat A_{RE} = \max\left\{ \arg\max_{0 < A < \infty}\ |X'V^{-1}(A)X|^{-1/2}\, |V|^{-1/2} \exp\left\{-\tfrac{1}{2}\, y'Py\right\} \times K,\ \ 0 \right\},$$
     where $K$ is a generic constant free from $A$ and $P \equiv P(A) = V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}$.
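One common way to maximize the restricted likelihood above is Fisher scoring. Differentiating the log of the objective gives the score $\tfrac{1}{2}\{y'P^2y - \mathrm{tr}(P)\}$, and the expected information is $\tfrac{1}{2}\mathrm{tr}(P^2)$. The sketch below assumes NumPy; the function name, the starting value, and the `floor` argument are illustrative choices and not taken from the slides.

```python
import numpy as np

def reml_fay_herriot(y, X, D, floor=0.0, tol=1e-10, max_iter=200):
    """REML estimate of A by Fisher scoring, truncated at zero (then at `floor`)."""
    y, X, D = (np.asarray(a, float) for a in (y, X, D))
    # Hypothetical starting value: a moment-type guess nudged away from zero.
    A = max(np.var(y, ddof=1) - D.mean(), 0.0) + 1e-3
    for _ in range(max_iter):
        v_inv = 1.0 / (A + D)
        XtVinv = X.T * v_inv                          # X' V^{-1}
        # P = V^{-1} - V^{-1} X (X' V^{-1} X)^{-1} X' V^{-1}
        P = np.diag(v_inv) - XtVinv.T @ np.linalg.solve(XtVinv @ X, XtVinv)
        Py = P @ y
        score = Py @ Py - np.trace(P)                 # twice the REML score
        info = np.sum(P * P)                          # tr(P^2), twice the information
        A_new = max(A + score / info, 0.0)
        if abs(A_new - A) < tol:
            A = A_new
            break
        A = A_new
    return max(A, floor)
```

In the balanced intercept-only case the iteration lands on $s^2 - D$ (sample variance minus the common sampling variance) when that is positive. Passing `floor=0.01` reproduces the kind of truncation mentioned for the simulation study.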

  16. Simulation study: Simulation set-up, the Fay-Herriot Model with Unequal Sampling Variances. $m = 15, 45$; $x_i'\beta = 0$; $A = 1$. There are two patterns of sampling variances $D_i$: Pattern (a) $\{0.7, 0.5, 0.4, 0.3\}$; Pattern (b) $\{20, 6, 5, 4, 2\}$. (When the REML estimate was zero, we truncated it at $0.01$.) Methods compared:
     CLL: the parametric bootstrap confidence interval (Chatterjee et al., 2008);
     HM: the synthetic confidence interval (Hall and Maiti, 2006);
     Cox: the Cox empirical Bayes confidence interval (Cox, 1975);
     PR: the method that uses the second-order unbiased estimator of the MSE (Prasad and Rao, 1990);
     Y: the method in which $z_{\alpha/2}$ is replaced by $z_{\alpha/2}\, c_i(\hat A)$ for some $c_i$ (under the Fay-Herriot model; Yoshimori, 2003).
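A sketch of how one data set for this set-up could be generated with the standard library. The slides do not say how the pattern values are assigned to areas, so the equal-sized consecutive groups used here are an assumption, as are the function names; only Pattern (b) is coded because its five values divide both $m = 15$ and $m = 45$ evenly.

```python
import random
from math import sqrt

PATTERN_B = [20.0, 6.0, 5.0, 4.0, 2.0]   # Pattern (b) from the set-up

def sampling_variances(m, pattern):
    """Assumed layout: equal-sized consecutive groups, one group per pattern value."""
    if m % len(pattern):
        raise ValueError("m must be a multiple of the number of pattern values")
    size = m // len(pattern)
    return [d for d in pattern for _ in range(size)]

def simulate(m, pattern, A=1.0, seed=None):
    """Draw (y, theta, D) from the Fay-Herriot model with x_i' beta = 0."""
    rng = random.Random(seed)
    D = sampling_variances(m, pattern)
    theta = [rng.gauss(0.0, sqrt(A)) for _ in D]             # Level 2: theta_i ~ N(0, A)
    y = [rng.gauss(t, sqrt(d)) for t, d in zip(theta, D)]    # Level 1: y_i | theta_i ~ N(theta_i, D_i)
    return y, theta, D
```

Repeating `simulate` many times and recording how often each interval covers `theta[i]` would give a Monte Carlo estimate of coverage for the methods listed.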
