NTTS 2017

Domain Estimation of Survey Discontinuities

Nikos Tzavidis¹

Joint work with Paul Smith (University of Southampton), Timo Schmid and Natalia Rojas-Perilla (Freie Universität Berlin), Jan van den Brakel (CBS & University of Maastricht), Silvia Manclossi, Chris McGowan & Lisa Walters (Welsh Government)

NTTS Conference, Brussels, March 13-17, 2017

¹ Southampton Statistical Sciences Research Institute, University of Southampton (n.tzavidis@soton.ac.uk)

What is a Survey Discontinuity?
◮ Surveys try to maintain consistent methodologies (sampling/survey design) over time
◮ This aids the comparability of survey estimates over time
◮ However, changes in design cannot be avoided
◮ Changes are designed to increase efficiency or reduce costs
◮ They can create breaks in the series, known as discontinuities

An Example: The National Survey for Wales
◮ The Welsh Government (WG) has reviewed the way in which social surveys are conducted in Wales
◮ WG instituted a new National Survey (NSn) from 2016
◮ The NSn collects information previously collected in 5 surveys
  ◮ The old National Survey (NSo)
  ◮ The Welsh Health Survey (WHS)
  ◮ The Active Adults Survey (AAS)
  ◮ The Arts in Wales Survey (AWS)
  ◮ The Welsh Outdoor Recreation Survey (WORS)

Reviewing Survey Operations: The National Survey for Wales
◮ Agreeing the NSn involved consultations with customers
◮ The NSn, like the NSo, uses a rotating design
◮ Longer questionnaire, but not all original questions included
◮ Sufficient sample for local authorities (LAs) and Welsh Health Boards
◮ Demonstrate that the new methodology is appropriate
◮ Produce estimates of discontinuities
◮ Estimates at national and sub-national levels

Potential Sources of Discontinuities
◮ Several changes in the NSn are potentially important
◮ Change of contractor
◮ Mode: telephone/self-completed to face-to-face
◮ Interviewer effects: social acceptability (sensitive questions)
◮ Questions from 5 surveys combined in a single questionnaire
◮ Possible ordering and context effects
◮ Impact of the new design on response propensity by subgroup

A Framework for Assessing Discontinuities
◮ WG put in place a large-scale pilot of the new design
◮ Similar design to the one used in the NSn (n = 2800)
◮ Discontinuities: difference between the estimates from the old surveys and those from the pilot
◮ Account for sampling variance
◮ Focus on discontinuities greater than 5 percentage points (Government Statistical Service - Methodology Advisory Committee, 2016)

Assumptions
◮ Assumption 1: The time difference between the pilot and the old surveys can be ignored
◮ Assumption 2: The pilot is used as if it were the new survey
◮ Ideally, estimate discontinuities by a split-sample experiment
◮ Old and new designs randomly administered to respondents
◮ This is not the case with the Welsh pilot survey
◮ We cannot say why discontinuities occur

Estimating Discontinuities with a Pilot Survey: National Level
◮ Denote by H-T the Horvitz-Thompson estimator of $\theta$
◮ $\hat{\theta}^{O}$ is the H-T estimator of $\theta$ from the old survey
◮ $\hat{\theta}^{N}$ is the H-T estimator of $\theta$ from the pilot survey
◮ Estimator of discontinuity: $\hat{D} = \hat{\theta}^{N} - \hat{\theta}^{O}$
◮ $\mathrm{Var}(\hat{D}) = \mathrm{Var}(\hat{\theta}^{N} - \hat{\theta}^{O})$

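The national-level estimator above can be sketched in a few lines of Python. This is only an illustration: the function name `discontinuity`, the default `cov=0.0` (which corresponds to assuming the old and pilot surveys are independent), and all numbers are assumptions made for the example, not part of the original analysis. The point estimates and their design-based variances are taken as given inputs.

```python
import math

def discontinuity(theta_new, var_new, theta_old, var_old, cov=0.0):
    """Return (D_hat, standard error, z statistic) for D_hat = theta_new - theta_old.

    cov=0.0 assumes independent surveys (an assumption of this sketch).
    """
    d_hat = theta_new - theta_old
    # Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B)
    var_d = var_new + var_old - 2.0 * cov
    if var_d <= 0:
        raise ValueError("variance of the discontinuity must be positive")
    se = math.sqrt(var_d)
    return d_hat, se, d_hat / se

# Hypothetical proportions, NOT results from the National Survey for Wales.
d_hat, se, z = discontinuity(theta_new=0.42, var_new=0.0004,
                             theta_old=0.50, var_old=0.0001)

exceeds_5pp = abs(d_hat) > 0.05   # larger than 5 percentage points?
significant = abs(z) > 1.96       # two-sided test at the 5% level (normal approx.)
```

With these made-up inputs the estimated discontinuity is negative, exceeds five percentage points in absolute value, and is significant under the normal approximation.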
Estimating Discontinuities with a Pilot Survey: Domain Level
◮ $\hat{\theta}^{O}_{k}$: direct estimator from the old survey in domain $k$
◮ $\hat{\theta}^{N}_{k}$: direct estimator from the pilot survey in domain $k$
◮ Direct estimator of discontinuity: $\hat{D}_{k} = \hat{\theta}^{N}_{k} - \hat{\theta}^{O}_{k}$
◮ Variance of the estimated discontinuity is possibly large due to the small sample size of the pilot
◮ Employ model-based estimation

Model-based Estimation of Discontinuities
◮ Area-level model (Fay & Herriot, 1979, JASA; Van den Brakel et al., 2016, JRSS A)
  $\hat{\theta}^{N}_{k} = x_{k}^{T}\beta + v_{k} + \epsilon_{k}$
  $v_{k} \sim N(0, \sigma^{2}_{v}); \quad \epsilon_{k} \sim N(0, \psi_{k})$
  $\hat{\theta}^{EBLUP}_{k} = \hat{\gamma}_{k}\hat{\theta}^{N}_{k} + (1 - \hat{\gamma}_{k})\, x_{k}^{T}\hat{\beta}$
◮ Estimated discontinuity: $\hat{D}^{M}_{k} = \hat{\theta}^{EBLUP}_{k} - \hat{\theta}^{O}_{k}$

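A minimal NumPy sketch of the shrinkage predictor may help fix ideas. It treats $\sigma^{2}_{v}$ as a given input rather than estimating it (a full EBLUP would plug in an estimate, for example by REML), and it uses the usual Fay-Herriot shrinkage factor $\gamma_{k} = \sigma^{2}_{v}/(\sigma^{2}_{v} + \psi_{k})$. The function name `fh_eblup` and all data are hypothetical.

```python
import numpy as np

def fh_eblup(theta_direct, X, psi, sigma2_v):
    """Shrinkage predictor under the area-level model.

    sigma2_v is treated as known here (assumption of this sketch).
    Returns (predictor, gamma, beta_hat).
    """
    theta_direct = np.asarray(theta_direct, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)
    w = 1.0 / (sigma2_v + psi)            # inverse total variance per domain
    XtW = X.T * w                         # X^T W with W = diag(w)
    beta = np.linalg.solve(XtW @ X, XtW @ theta_direct)  # weighted least squares
    gamma = sigma2_v / (sigma2_v + psi)   # shrinkage factor gamma_k
    eblup = gamma * theta_direct + (1.0 - gamma) * (X @ beta)
    return eblup, gamma, beta

# Hypothetical data for four domains (not from the Welsh surveys).
theta_new = np.array([0.30, 0.45, 0.52, 0.38])     # pilot direct estimates
theta_old = np.array([0.35, 0.47, 0.50, 0.45])     # old-survey direct estimates
X = np.column_stack([np.ones(4), [0.2, 0.5, 0.6, 0.3]])  # intercept + covariate
psi = np.array([0.002, 0.010, 0.004, 0.020])       # sampling variances psi_k

eblup, gamma, beta = fh_eblup(theta_new, X, psi, sigma2_v=0.003)
synthetic = X @ beta                 # regression-synthetic part x_k^T beta_hat
d_model = eblup - theta_old          # model-based discontinuities
d_direct = theta_new - theta_old     # direct discontinuities, for contrast
```

Domains with a large sampling variance $\psi_{k}$ receive a small $\gamma_{k}$, so their predictor is pulled towards the regression-synthetic value.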
Practical Challenges with Model-based Estimation
◮ Sampling variance $\psi_{k}$ is assumed known; in practice it is estimated, accounting for the complex sampling design
◮ $\mathrm{Var}(\hat{D}^{M}_{k}) = \mathrm{MSE}(\hat{\theta}^{EBLUP}_{k}) + \mathrm{Var}(\hat{\theta}^{O}_{k}) - 2\,\mathrm{Cov}(\hat{\theta}^{N}_{k}, \hat{\theta}^{O}_{k})$
◮ Estimating $\mathrm{Cov}(\hat{\theta}^{N}_{k}, \hat{\theta}^{O}_{k})$ is complex (Van den Brakel, 2016)
◮ Model covariates can be taken from survey or admin data

Practical Challenges with Model-based Estimation
◮ Model covariates from surveys are treated as random variables
◮ Extend the area-level model to account for measurement error in the predictors (Ybarra & Lohr, 2008, Biometrika)
  $\hat{\theta}^{ME\text{-}EBLUP}_{k} = \hat{\gamma}_{k}\hat{\theta}^{N}_{k} + (1 - \hat{\gamma}_{k})\, \hat{x}_{k}^{T}\hat{\beta}$
  $\gamma_{k} = \dfrac{\sigma^{2}_{v} + \beta^{T} V_{k} \beta}{\sigma^{2}_{v} + \beta^{T} V_{k} \beta + \psi_{k}}$
◮ $V_{k}$ is the variance-covariance matrix of $\hat{x}_{k}$
◮ Estimating $V_{k}$ is another practical challenge

Experimental Results for the National Survey for Wales
◮ We present selected anonymised results
◮ Ranges of direct point estimates (Horvitz-Thompson) of national discontinuities
◮ Plots of model and direct point estimates for domains
◮ Model-based estimates produced with the F-H model
◮ $\psi_{k}$: variance of the H-T estimator
◮ Domains defined by crossing areas with demographic groups

Experimental Results for the National Survey for Wales
Ranges of Significant (National) Discontinuities - Direct

Surveys     Range
Survey 1    -0.108, -0.058
Survey 2    -0.111,  0.166
Survey 3    -0.082,  0.110
Survey 4    -0.152,  0.184

Experimental Results for the National Survey for Wales - Variable 1
[Figure: two panels, "Discontinuities − FH Model" and "Discontinuities − Direct"; x-axis: Domains (0 to 40); y-axis: estimated discontinuity (−0.4 to 0.4)]

Experimental Results for the National Survey for Wales - Variable 2
[Figure: two panels, "Discontinuities − FH Model" and "Discontinuities − Direct"; x-axis: Domains (0 to 40); y-axis: estimated discontinuity (−0.2 to 0.4)]

Comments
◮ For this example, consistently negative estimates of discontinuities for some key variables
◮ A number of variables show discontinuities that are larger than the nominally interesting five percentage points
◮ Evidence that an adjustment for continuity should be made
◮ The pattern of discontinuities differs by subgroup (domain)
◮ Width of the 95% CI is smaller for model-based estimates

A Word of Caution
◮ Model-based estimate: combines model and direct estimates
◮ Model-based estimates are affected by shrinkage
◮ Although variance is reduced, bias can increase
◮ Use model-based estimates cautiously
◮ Contrast them with direct estimates
◮ Accounting for measurement error may reduce shrinkage

The Impact of Measurement Error in Covariates
◮ Shrinkage factor, F-H model:
  $\gamma_{k} = \dfrac{\sigma^{2}_{v}}{\sigma^{2}_{v} + \psi_{k}}$
◮ Shrinkage factor, F-H model with measurement error:
  $\gamma^{ME}_{k} = \dfrac{\sigma^{2}_{v} + \beta^{T} V_{k} \beta}{\sigma^{2}_{v} + \beta^{T} V_{k} \beta + \psi_{k}}$
◮ $\dfrac{\gamma_{k}}{\gamma^{ME}_{k}} < 1 \Rightarrow \gamma^{ME}_{k} > \gamma_{k}$
◮ Higher weight to the direct estimator under the ME model

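The two shrinkage factors can be compared numerically. The sketch below is illustrative only: the function names `shrinkage_fh` and `shrinkage_me` and every number (including the assumption that the intercept is measured without error, so only the second diagonal entry of $V_{k}$ is non-zero) are made up for the example.

```python
import numpy as np

def shrinkage_fh(sigma2_v, psi_k):
    """gamma_k under the standard F-H model."""
    return sigma2_v / (sigma2_v + psi_k)

def shrinkage_me(sigma2_v, psi_k, beta, V_k):
    """gamma_k^ME under the F-H model with measurement error in covariates."""
    beta = np.asarray(beta, dtype=float)
    # Quadratic form beta^T V_k beta; non-negative when V_k is a covariance matrix
    extra = float(beta @ np.asarray(V_k, dtype=float) @ beta)
    return (sigma2_v + extra) / (sigma2_v + extra + psi_k)

# Hypothetical values for one domain k.
sigma2_v, psi_k = 0.003, 0.010
beta = [0.1, 0.6]                    # intercept and one survey-based covariate
V_k = [[0.0, 0.0],
       [0.0, 0.01]]                  # only the covariate carries sampling error

gamma_k = shrinkage_fh(sigma2_v, psi_k)
gamma_me_k = shrinkage_me(sigma2_v, psi_k, beta, V_k)
```

Here $\beta^{T} V_{k} \beta = 0.0036$, so the weight on the direct estimator rises from about 0.23 to about 0.40 once measurement error is taken into account.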
Current Research
Alternative modelling frameworks
◮ Approach 1: Model the discontinuity directly
◮ Approach 2: Multivariate F-H model, with joint modelling of direct estimates from the old and pilot surveys
Further topics
◮ Implementing the measurement error F-H model
◮ Benchmarking of domain discontinuities to national estimates