Small area estimation to quantify discontinuities in sample surveys


  1. Small area estimation to quantify discontinuities in sample surveys
     Jan A. van den Brakel (1, 2), Bart Buelens (1), Harm-Jan Boonstra (1)
     First Asian ISI Satellite Meeting on Small Area Estimation, Bangkok, Thailand, 1-4 September 2013.
     (1) Statistics Netherlands, Department of Statistical Methods
     (2) Maastricht University, Department of Quantitative Economics
     Sections: Introduction, Small area estimators, Model selection, Analyzing discontinuities, Results discontinuities, Conclusions

  2. Outline
     1. Introduction
     2. Small area estimators
     3. Model selection
     4. Analyzing discontinuities
     5. Results discontinuities
     6. Conclusions

  3. Introduction
     - Survey → measurement error: $y_{k,i} = u_{k,i} + b_i + e_{k,i}$
     - Survey redesign → affects the measurement error $b_i$
     - Discontinuities: $\Delta_i = y_i^{(a)} - y_i^{(r)}$
     - Quantification through a parallel run:
       - Regular survey, full sample size: direct estimators $\hat{y}_i^{(r)}$
       - Alternative sample, reduced sample size: small area estimation for $\tilde{y}_i^{(a)}$
       - Additional auxiliary information: direct estimates from the regular survey, $\hat{y}_i^{(r)}$

  4. Introduction
     This paper:
     - Direct estimates from the regular survey, $\hat{y}_i^{(r)}$, as additional information in models for small area estimators
     - Variance estimation of discontinuities:
       $\mathrm{var}(\hat{\Delta}_i) = \mathrm{var}(\hat{y}_i^{(r)}) + \mathrm{var}(\tilde{y}_i^{(a)}) - 2\,\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$
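
As a worked illustration of the variance formula for the estimated discontinuity, here is a minimal Python sketch. The function names and the input numbers are hypothetical, and truncating a negative variance estimate at zero before taking the square root is an assumption of this sketch.

```python
import math

def discontinuity_variance(var_regular, var_alternative, cov_regular_alternative):
    """var(delta_hat) = var(y_r) + var(y_a) - 2 * cov(y_r, y_a)."""
    return var_regular + var_alternative - 2.0 * cov_regular_alternative

def discontinuity_se(var_regular, var_alternative, cov_regular_alternative):
    # Assumption of this sketch: a negative variance estimate is truncated at zero.
    variance = discontinuity_variance(var_regular, var_alternative,
                                      cov_regular_alternative)
    return math.sqrt(max(variance, 0.0))

# Hypothetical inputs: var(y_r) = 4, var(y_a) = 9, cov = 2
print(discontinuity_se(4.0, 9.0, 2.0))  # 3.0
```

A positive covariance between the two estimates lowers the variance of their difference, which is why the covariance term cannot simply be dropped when the small area model borrows information from the regular survey.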

  5. Introduction
     Redesign of the Crime Victimization Survey (CVS) in 2008:
     - Regular (new) survey design (ISM):
       - Stratified simple random sampling, with 25 police regions as strata
       - Sample size: 19000 responses (about 750 per domain)
       - GREG estimator for domains: $\hat{y}_i^{(r)}$
     - Alternative (old) survey (NSM):
       - Stratified simple random sampling, with 25 police regions as strata
       - Sample size: 6000 responses (proportional allocation)
       - SAE for domains: $\tilde{y}_i^{(a)}$

  6. Small area estimators
     Auxiliary information:
     - Municipal Basic Administration (gender, age, household size, nationality, urbanization, municipality, province, etc.)
     - Police Register of Reported Offences
     - Direct estimates of the target variable and related variables from the regular survey
     - Direct estimates from preceding editions of the survey
     Area level model (Fay and Herriot, 1979):
       $\hat{y}_i^{(a)} = y_i^{(a)} + e_i = z_i^t \beta + v_i + e_i$,
       $v_i \overset{iid}{\sim} N(0, \sigma_v^2)$, $e_i \overset{ind}{\sim} N(0, \psi_i)$

  7. Small area estimators
     1. EBLUP for auxiliary variables with error (Ybarra and Lohr, 2008):
        $\tilde{y}_i^{(a)} = \hat{\gamma}_i \hat{y}_i^{(a)} + (1 - \hat{\gamma}_i)\, \hat{z}_i^t \hat{\beta}$,
        $\hat{\gamma}_i = \dfrac{\hat{\sigma}_v^2 + \hat{\beta}^t\, \widehat{\mathrm{cov}}(\hat{z}_i)\, \hat{\beta}}{\hat{\sigma}_v^2 + \hat{\beta}^t\, \widehat{\mathrm{cov}}(\hat{z}_i)\, \hat{\beta} + \psi_i}$
     2. Standard EBLUP (Rao, 2003):
        $\tilde{y}_i^{(a)} = \hat{\gamma}_i \hat{y}_i^{(a)} + (1 - \hat{\gamma}_i)\, z_i^t \hat{\beta}$,
        $\hat{\gamma}_i = \dfrac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \psi_i}$
     3. Hierarchical Bayesian approach (Rao, 2003, Section 10.3):
        posterior mean and variance for the area level model with a flat prior on $\beta$ and $\sigma_v^2$.
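
The two EBLUP weights on this slide differ only in the extra term $\hat{\beta}^t \widehat{\mathrm{cov}}(\hat{z}_i) \hat{\beta}$. A minimal Python sketch of both, with hypothetical function names and made-up inputs, and with the model parameters assumed to be already estimated:

```python
def shrinkage_factor(sigma2_v, psi_i, beta=None, cov_z_i=None):
    """gamma_i = s / (s + psi_i), with s = sigma2_v + beta' cov(z_i) beta.

    If beta or cov_z_i is omitted, the quadratic term is zero and the
    standard EBLUP weight sigma2_v / (sigma2_v + psi_i) is returned.
    """
    extra = 0.0
    if beta is not None and cov_z_i is not None:
        p = len(beta)
        # Quadratic form beta' C beta for a p x p covariance matrix C.
        extra = sum(beta[k] * cov_z_i[k][l] * beta[l]
                    for k in range(p) for l in range(p))
    signal = sigma2_v + extra
    return signal / (signal + psi_i)

def composite_estimate(y_direct_i, z_i, beta, gamma_i):
    """gamma_i * direct estimate + (1 - gamma_i) * synthetic part z_i' beta."""
    synthetic = sum(z * b for z, b in zip(z_i, beta))
    return gamma_i * y_direct_i + (1.0 - gamma_i) * synthetic

# Hypothetical inputs: intercept plus one covariate measured with error.
beta = [1.0, 2.0]
cov_z = [[0.0, 0.0], [0.0, 0.25]]   # the intercept carries no sampling error
g_standard = shrinkage_factor(1.0, 1.0)
g_with_error = shrinkage_factor(1.0, 1.0, beta, cov_z)
print(g_standard, g_with_error)
print(composite_estimate(10.0, [1.0, 2.0], beta, g_standard))
```

In this sketch, sampling error in the covariates increases the weight on the direct estimate, because the synthetic part becomes less reliable.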

  8. Model selection
     - Procedure: step-forward variable selection
     - Criterion: conditional AIC
     - Penalty: trace of the "hat" matrix $H$, where $\hat{y} = Hy$
     Percentage improvement in the coefficient of variation of the HB estimates compared to the direct estimates, for optimal models based on different sets of covariates:

     variable    admin    + Police register    + ISM
     offtot      47%      49%                  56%
     unsafe      24%      29%                  37%
     nuisance    29%      35%                  51%
     satispol    50%      50%                  55%
     propvict    49%      49%                  51%
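
To make the selection criterion concrete, the sketch below computes a conditional AIC with the trace of the hat matrix as penalty. It rests on several simplifying assumptions that are not taken from the slides: a single covariate without intercept, a known and positive model variance `sigma2_v`, known sampling variances `psi`, and a normal conditional likelihood. All names and inputs are hypothetical.

```python
import math

def fit_area_level(y_direct, z, psi, sigma2_v):
    """Fitted values and trace of the hat matrix for a one-covariate area level model."""
    gamma = [sigma2_v / (sigma2_v + p) for p in psi]
    t = sum(g * zi * zi for g, zi in zip(gamma, z))
    # Weighted least squares slope; the weights are proportional to gamma_i.
    beta = sum(g * zi * yi for g, zi, yi in zip(gamma, z, y_direct)) / t
    fitted = [g * yi + (1.0 - g) * zi * beta
              for g, zi, yi in zip(gamma, z, y_direct)]
    # Diagonal of H: derivative of fitted value i with respect to y_direct[i].
    trace_h = sum(g + (1.0 - g) * g * zi * zi / t for g, zi in zip(gamma, z))
    return fitted, trace_h

def conditional_aic(y_direct, z, psi, sigma2_v):
    """-2 * conditional log-likelihood + 2 * trace(H)."""
    fitted, trace_h = fit_area_level(y_direct, z, psi, sigma2_v)
    neg2_loglik = sum(math.log(2.0 * math.pi * p) + (yi - fi) ** 2 / p
                      for yi, fi, p in zip(y_direct, fitted, psi))
    return neg2_loglik + 2.0 * trace_h

# Hypothetical inputs for two areas.
print(conditional_aic([2.0, 4.0], [1.0, 1.0], [1.0, 1.0], 1.0))
```

In a step-forward search, a candidate covariate would be added only if it lowers this criterion; with an estimated model variance the penalty would need a correction that this sketch omits.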

  9. Model selection
     Optimal models:

     variable    cAIC-based model
     offtot      REG victim
     unsafe      REG nuisance, ADM benefit, PR propcrim, PR drugs
     nuisance    REG nuisance, ADM old
     satispol    REG funcpol
     propvict    PR propcrim, ADM old

     All models also include an intercept (not shown).
     REG *: direct estimate from the regular survey
     PR *: Police Register of Reported Offences
     ADM *: Municipal Basic Administration

  10. Analyzing discontinuities
     - Discontinuity: $\hat{\Delta}_i = \hat{y}_i^{(r)} - \tilde{y}_i^{(a)}$
     - Variance: $\mathrm{var}(\hat{\Delta}_i) = \mathrm{var}(\hat{y}_i^{(r)}) + \mathrm{MSE}(\tilde{y}_i^{(a)}) - 2\,\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$
     Problem:
       $\tilde{y}_i^{(a)} = \hat{\gamma}_i \hat{y}_i^{(a)} + (1 - \hat{\gamma}_i)\, \hat{z}_i^t \hat{\beta}$
     $\hat{z}_i$ and $\hat{\beta}$ contain survey estimates from the regular survey (same target variable or related variables):
     - Design correlation between $\hat{y}_i^{(r)}$ and $\tilde{y}_i^{(a)}$
     - Nonlinear term: $\hat{z}_i^t \hat{\beta}$

  11. Analyzing discontinuities
     Covariance $\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$:
     - First order Taylor approximation for $\hat{z}_i^t \hat{\beta}$ around $z_i$ and $y_i^{(a)}$.
     - Approximation for $\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$:
       $(1 - \hat{\gamma}_i)\left[(1 - \hat{\gamma}_i \hat{z}_i^t \hat{T}^{-1} \hat{z}_i)\, \hat{\beta}^t + \hat{\gamma}_i (\hat{\theta}_i - \hat{z}_i^t \hat{\beta})\, \hat{z}_i^t \hat{T}^{-1}\right] \widehat{\mathrm{cov}}(\hat{y}_i^{(r)}, \hat{z}_i)$,
       with
       $\hat{T} = \sum_{i=1}^{m} \hat{\gamma}_i \hat{z}_i \hat{z}_i^t$, $\hat{\beta} = \hat{T}^{-1} \sum_{i=1}^{m} \hat{\gamma}_i \hat{z}_i \hat{\theta}_i$
     - $\widehat{\mathrm{cov}}(\hat{y}_i^{(r)}, \hat{z}_i)$: vector with design covariances between $\hat{y}_i^{(r)}$ and $\hat{z}_i$
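
The covariance approximation can be traced numerically in the simplest case. The sketch below assumes a single covariate without intercept, so that $\hat{T}$ and $\hat{\beta}$ are scalars; the function name and the inputs are hypothetical, and the shrinkage factors are taken as given.

```python
def covariance_approximation(gamma, z_hat, theta_hat, cov_yr_z):
    """Approximate cov(y_r_i, y_a_tilde_i) per area for one covariate.

    gamma     : shrinkage factors gamma_i
    z_hat     : covariate estimates z_i taken from the regular survey
    theta_hat : direct estimates theta_i entering beta_hat
    cov_yr_z  : design covariances between y_r_i and z_i
    """
    # Scalar versions of T = sum gamma_i z_i z_i' and beta = T^-1 sum gamma_i z_i theta_i.
    t = sum(g * z * z for g, z in zip(gamma, z_hat))
    beta = sum(g * z * th for g, z, th in zip(gamma, z_hat, theta_hat)) / t
    result = []
    for g, z, th, c in zip(gamma, z_hat, theta_hat, cov_yr_z):
        # Bracketed term: derivative of z_i * beta with respect to z_i.
        slope = (1.0 - g * z * z / t) * beta + g * (th - z * beta) * z / t
        result.append((1.0 - g) * slope * c)
    return result

# Hypothetical inputs for two areas.
print(covariance_approximation([0.5, 0.5], [1.0, 1.0], [2.0, 4.0], [2.0, 2.0]))
# [1.0, 2.0]
```

With several covariates the same expression applies with a matrix inverse for $\hat{T}$ and a covariance vector per area; the scalar case is used here only to keep the arithmetic visible.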

  12. Analyzing discontinuities
     $\mathrm{var}(\hat{\Delta}_i) = \mathrm{var}(\hat{y}_i^{(r)}) + \mathrm{MSE}(\tilde{y}_i^{(a)}) - 2\,\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$
     $\mathrm{MSE}(\tilde{y}_i^{(a)})$:
     1. Posterior variance of the HB estimator
     2. Design-based approximation:
        - Taylor approximation for $\tilde{y}_i^{(a)}$ around $z_i$ and $y_i^{(a)}$
        - Approximation for $\mathrm{MSE}(\tilde{y}_i^{(a)})$:
          $\hat{\gamma}_i^2\, \widehat{\mathrm{var}}(\hat{y}_i^{(a)}) + (1 - \hat{\gamma}_i)^2 \left[ \sum_{j=1}^{m} \hat{B}_{i,j}\, \widehat{\mathrm{cov}}(\hat{z}_j)\, \hat{B}_{i,j}^t + \sum_{j=1}^{m} \hat{C}_{i,j}^2\, \widehat{\mathrm{var}}(\hat{y}_j^{(a)}) \right] + 2 \hat{\gamma}_i (1 - \hat{\gamma}_i)\, \hat{C}_{i,i}\, \widehat{\mathrm{var}}(\hat{y}_i^{(a)})$,
          with
          $\hat{B}_{i,j} = (\delta_{i,j} - \hat{\gamma}_j \hat{z}_i^t \hat{T}^{-1} \hat{z}_j)\, \hat{\beta}^t + \hat{\gamma}_j (\hat{y}_j^{(a)} - \hat{z}_j^t \hat{\beta})\, \hat{z}_i^t \hat{T}^{-1}$,
          $\hat{C}_{i,j} = \hat{z}_i^t \hat{T}^{-1} \hat{z}_j \hat{\gamma}_j$.
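
A minimal Python sketch of the design-based MSE approximation follows, again assuming a single covariate without intercept so that all quantities are scalars. The function name and inputs are hypothetical, and the shrinkage factors are treated as fixed.

```python
def mse_approximation(gamma, z_hat, y_direct, var_y, var_z):
    """Design-based approximation of MSE(y_a_tilde_i) for one covariate.

    gamma    : shrinkage factors gamma_j
    z_hat    : covariate estimates z_j
    y_direct : direct estimates y_a_j from the alternative sample
    var_y    : design variances of y_direct
    var_z    : design variances of z_hat
    """
    m = len(gamma)
    t = sum(g * z * z for g, z in zip(gamma, z_hat))
    beta = sum(g * z * y for g, z, y in zip(gamma, z_hat, y_direct)) / t
    mse = []
    for i in range(m):
        sum_b = 0.0
        sum_c = 0.0
        for j in range(m):
            delta = 1.0 if i == j else 0.0
            # B_ij: sensitivity of z_i * beta to the covariate estimate z_j.
            b_ij = ((delta - gamma[j] * z_hat[i] * z_hat[j] / t) * beta
                    + gamma[j] * (y_direct[j] - z_hat[j] * beta) * z_hat[i] / t)
            # C_ij: sensitivity of z_i * beta to the direct estimate y_a_j.
            c_ij = z_hat[i] * z_hat[j] * gamma[j] / t
            sum_b += b_ij * b_ij * var_z[j]
            sum_c += c_ij * c_ij * var_y[j]
        g = gamma[i]
        c_ii = z_hat[i] * z_hat[i] * g / t
        mse.append(g * g * var_y[i]
                   + (1.0 - g) ** 2 * (sum_b + sum_c)
                   + 2.0 * g * (1.0 - g) * c_ii * var_y[i])
    return mse

# Hypothetical inputs; with var_z = 0 the covariates are treated as error-free.
print(mse_approximation([0.5, 0.5], [1.0, 1.0], [2.0, 4.0], [1.0, 1.0], [0.0, 0.0]))
# [0.625, 0.625]
```

In this example with error-free covariates the estimator is linear in the direct estimates (0.75 times the own-area estimate plus 0.25 times the other), so the approximation reproduces the exact variance 0.5625 + 0.0625 = 0.625. Every term in the expression is non-negative, so the result cannot be negative.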

  13. Analyzing discontinuities
     Three estimators for
     $\mathrm{var}(\hat{\Delta}_i) = \mathrm{var}(\hat{y}_i^{(r)}) + \mathrm{MSE}(\tilde{y}_i^{(a)}) - 2\,\mathrm{cov}(\hat{y}_i^{(r)}, \tilde{y}_i^{(a)})$:
     1. Posterior variance of the HB estimator for $\mathrm{MSE}(\tilde{y}_i^{(a)})$
     2. Design-based approximation for $\mathrm{MSE}(\tilde{y}_i^{(a)})$
     3. Bootstrap approximation:
        - Repeatedly draw bootstrap samples from the original sample (regular and alternative sample)
        - Calculate $\hat{\Delta}_{i,b} = \hat{y}_{i,b}^{(r)} - \tilde{y}_{i,b}^{(a)}$, $b = 1, \ldots, B$
        - $\widehat{\mathrm{var}}(\hat{\Delta}_i) = \frac{1}{B} \sum_{b=1}^{B} (\hat{\Delta}_{i,b} - \bar{\hat{\Delta}}_i)^2$
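
The bootstrap steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used for the reported results: it resamples records with replacement and ignores the stratified design, and `difference_in_means` is a hypothetical stand-in for re-estimating the direct and small area estimates on each replicate.

```python
import random

def replicate_variance(replicates):
    """(1/B) * sum over b of (delta_b - mean(delta))^2."""
    mean = sum(replicates) / len(replicates)
    return sum((d - mean) ** 2 for d in replicates) / len(replicates)

def bootstrap_discontinuity_variance(regular, alternative,
                                     estimate_discontinuity,
                                     n_boot=200, seed=0):
    """Resample both samples, re-estimate the discontinuity, take the variance."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Assumption: simple with-replacement resampling, no stratification.
        regular_b = rng.choices(regular, k=len(regular))
        alternative_b = rng.choices(alternative, k=len(alternative))
        replicates.append(estimate_discontinuity(regular_b, alternative_b))
    return replicate_variance(replicates)

def difference_in_means(regular_b, alternative_b):
    # Hypothetical stand-in for y_r_hat minus y_a_tilde on one replicate.
    return (sum(regular_b) / len(regular_b)
            - sum(alternative_b) / len(alternative_b))

# Hypothetical data for one domain.
regular = [3.0, 5.0, 4.0, 6.0, 5.0, 4.0]
alternative = [2.0, 4.0, 3.0]
print(bootstrap_discontinuity_variance(regular, alternative, difference_in_means))
```

Passing both resampled samples to a single callable reflects that the small area estimate depends on the regular sample through the covariates, so the covariance between the two estimates is captured by the replicates without a separate formula.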

  14. Results discontinuities
     Comparison of HB point and SE estimates with bootstrap results, averaged over districts:

                 Analytic                      Bootstrap
     variable    HB est.    SE(1)    SE(2)     HB est.    SE
     offtot      33.21      2.43     2.90      33.29      3.13
     unsafe      19.83      1.76     1.64      19.84      1.92
     nuisance     1.29      0.06     0.08       1.28      0.08
     satispol    55.09      3.00     2.54      55.29      3.58
     propvict     9.85      1.09     0.84       9.88      1.12

  15. Results discontinuities
     Analysis results for discontinuities, averaged over districts:

                 Analytic                    Bootstrap        GREG
     variable    Disc.    SE(1)    SE(2)     Disc.    SE      Disc.    SE
     offtot      9.08     3.54     3.92      9.02     4.92    9.01     7.69
     unsafe      4.55     2.54     2.46      4.54     2.69    4.52     3.57
     nuisance    0.33     0.05*    0.07      0.33     0.11    0.33     0.17
     satispol    5.52     4.98     4.72      5.33     5.43    5.04     8.21
     propvict    2.70     1.95     1.84      2.70     1.97    2.78     2.77

     *: For nuisance, 2 districts with negative variance estimates for the estimated discontinuity are truncated at zero.

  16. Conclusions
     - Additional information from the regular survey is useful for SAE models (substantial reduction of standard errors)
     - Variance approximations:
       - Design-based covariance approximation
       - Design-based approximation of the MSE of SAE predictions
       - Avoids negative variance estimates
     - Alternative (further research): bivariate area level model
