Probability Sampling Approach to Editing
Maiki Ilves (1), Prof. Thomas Laitila (2)
(1) Department of Statistics, Örebro University, Sweden
(2) Department of Statistics, Örebro University and Statistics Sweden

Introduction
The role of editing:
1. To assess the quality of data
2. To improve the survey by identifying error sources
3. To correct errors

Different ways of editing
Traditional micro-editing
Automated editing
Selective editing
Macro-editing

Selective editing - 1
Purpose: prioritize suspicious responses according to their influence on the survey estimates and edit only the most influential responses.
Three stages:
1. Identify suspicious responses using editing rules.
2. Prioritize them with a score function, i.e. a function of the measured value and the expected amended value. Local score, global score.
3. Determine the cut-off point, in a simulation study based on a fully edited dataset.

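The prioritization stage can be illustrated with a minimal sketch. The slides do not give a formula for the score, so the form used here (weighted absolute deviation from the expected amended value, relative to the estimated total) is an assumption borrowed from common practice, and the function names `local_scores`, `global_scores` and `units_to_edit` are hypothetical. The global score is taken as the maximum of a unit's local scores, which is one of several possible choices.

```python
def local_scores(x, x_expected, weights):
    """Local score per unit and variable (assumed form, not from the slides):
    weight * |measured - expected amended| / estimated total of the variable.
    x and x_expected are lists of rows (units) by columns (variables)."""
    n_vars = len(x[0])
    totals = [sum(w * row[j] for w, row in zip(weights, x_expected))
              for j in range(n_vars)]
    return [[w * abs(xr[j] - er[j]) / totals[j] for j in range(n_vars)]
            for w, xr, er in zip(weights, x, x_expected)]


def global_scores(local):
    """Global score per unit: here the maximum of its local scores (an assumption)."""
    return [max(row) for row in local]


def units_to_edit(scores, cutoff):
    """Indices of units whose global score exceeds the cut-off."""
    return [k for k, s in enumerate(scores) if s > cutoff]
```

For example, with measured values `[[10, 5], [100, 5], [10, 50]]`, expected values `[10, 5]` for every unit and unit weights, only the second and third units exceed a cut-off of 1.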
Selective editing - 2
Evaluation: relative pseudo-bias
$$\left| \frac{\hat{\theta}_q - \hat{\theta}_{100}}{\mathrm{se}(\hat{\theta}_{100})} \right|$$
where $q$ is the percentage of suspicious responses pursued.

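The evaluation can be sketched as follows, assuming the estimate of interest is a weighted total, that all units passed in are suspicious, and that the standard error of the fully edited estimate is supplied by the user. The function names and the ranking rule (edit the top q percent by score) are illustrative assumptions, not taken from the slides.

```python
def total_after_editing(x, z, weights, scores, q):
    """Weighted total when the top q percent of suspicious units (by score)
    have their observed value x replaced by the edited value z."""
    n_edit = round(len(x) * q / 100)
    ranked = sorted(range(len(x)), key=lambda k: scores[k], reverse=True)
    edited = set(ranked[:n_edit])
    return sum(w * (z[k] if k in edited else x[k])
               for k, w in enumerate(weights))


def relative_pseudo_bias(theta_q, theta_100, se_theta_100):
    """Absolute difference between the estimate after pursuing q percent and
    the fully edited estimate, in units of the latter's standard error."""
    return abs(theta_q - theta_100) / se_theta_100
```

Computing `relative_pseudo_bias` over a grid of `q` values on a fully edited dataset is one way to look for a cut-off beyond which further editing changes the estimate very little.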
Selective editing - 3
Advantages
+ Reduced costs
+ Reduced response burden
+ Gain in timeliness
Disadvantages
- How should the effect of editing be taken into account at the estimation stage?
- The influence of edited data on different statistical analyses is not known.
- So far used only on quantitative variables.

Estimating measurement bias
Literature: Madow (1965), Lessler and Kalsbeek (1992), Rao and Sitter (1997)
Bias is estimated through double sampling, also called two-phase sampling. For all subsampled units the true values are recorded, and the differences between the true and observed values are used to estimate the bias.

Probability sampling approach
Our idea: combine selective editing with bias estimation, and derive an unbiased estimator and its variance for this approach.
[Diagram: population $U$ divided into $U_1$ and $U_2$; first-phase sample $s_a$ divided into $s_{a1}$ and $s_{a2}$; subsample $s_2$ drawn within $s_{a2}$.]

Unbiased estimator for edited data
Notation:
$z_k$, $k \in U_1$: true value
$x_k$, $k \in U_2$: observed value
$y_k = I_k^{edit} z_k + (1 - I_k^{edit}) x_k$, $k \in U$: observed value after selective editing
We want to estimate $t_z = \sum_U z_k$.
The HT-estimator $\hat{t}_y = \sum_{s_a} y_k / \pi_{ak}$ is biased.
The estimator of the bias is
$$\hat{B}(\hat{t}_y) = \sum_{s_2} \frac{e_k}{\pi_{ak}\,\pi_{k|s_{a2}}}, \qquad e_k = x_k - z_k.$$
The bias-corrected estimator is $\hat{t}_z = \hat{t}_y - \hat{B}(\hat{t}_y)$.

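As a minimal sketch of the estimators on this slide, the code below computes the HT total, the bias estimate from the subsample $s_2$, and their difference. The function and argument names are hypothetical; the inputs are assumed to be plain lists, with the bias inputs restricted to units in $s_2$, for which both the observed value and the true value are known.

```python
def ht_total(y_sa, pi_a_sa):
    """Horvitz-Thompson total over the first-phase sample s_a."""
    return sum(y / p for y, p in zip(y_sa, pi_a_sa))


def bias_estimate(e_s2, pi_a_s2, pi_2_s2):
    """Bias estimate over the subsample s_2, with e_k = x_k - z_k and each unit
    weighted by 1 / (first-phase prob. * conditional second-phase prob.)."""
    return sum(e / (pa * p2) for e, pa, p2 in zip(e_s2, pi_a_s2, pi_2_s2))


def bias_corrected_total(y_sa, pi_a_sa, e_s2, pi_a_s2, pi_2_s2):
    """Bias-corrected estimate: HT total minus the estimated bias."""
    return ht_total(y_sa, pi_a_sa) - bias_estimate(e_s2, pi_a_s2, pi_2_s2)
```

With four first-phase units `y_sa = [5, 6, 4, 8]`, all with $\pi_{ak} = 0.5$, and a single subsampled unit with $e_k = 2$ and $\pi_{k|s_{a2}} = 0.5$, the HT total is 46, the estimated bias is 8, and the corrected total is 38.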
Precision of the estimators
$$\mathrm{MSE}(\hat{t}_y) = V(\hat{t}_y) + B^2(\hat{t}_y),$$
$$\mathrm{MSE}(\hat{t}_z) = V(\hat{t}_y) + V(\hat{B}(\hat{t}_y)) - 2\,C(\hat{t}_y, \hat{B}(\hat{t}_y)),$$
where
$$V(\hat{t}_y) = \sum\sum_{U} \Delta_{akl} \frac{y_k}{\pi_{ak}} \frac{y_l}{\pi_{al}}, \tag{1}$$
$$V(\hat{B}(\hat{t}_y)) = \sum\sum_{U_2} \Delta_{akl} \frac{e_k}{\pi_{ak}} \frac{e_l}{\pi_{al}} + E_a\left( \sum\sum_{U_2} \Delta_{kl|s_{a2}}\, I_{ak} I_{al}\, \frac{e_k}{\pi_{ak}\pi_{k|s_{a2}}} \frac{e_l}{\pi_{al}\pi_{l|s_{a2}}} \right), \tag{2}$$
$$C(\hat{t}_y, \hat{B}(\hat{t}_y)) = \sum_{U} \sum_{U_2} \Delta_{akl} \frac{y_k}{\pi_{ak}} \frac{e_l}{\pi_{al}}.$$

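A short note on why the MSE of the bias-corrected estimator has no squared-bias term, assuming (as the slides state) that $\hat{B}(\hat{t}_y)$ is unbiased for $B(\hat{t}_y)$. The correction removes the bias in expectation, so the MSE reduces to the variance of a difference of two estimators:

```latex
\begin{align*}
E(\hat{t}_z) &= E(\hat{t}_y) - E\bigl(\hat{B}(\hat{t}_y)\bigr)
              = \bigl(t_z + B(\hat{t}_y)\bigr) - B(\hat{t}_y) = t_z, \\
\mathrm{MSE}(\hat{t}_z) &= V\bigl(\hat{t}_y - \hat{B}(\hat{t}_y)\bigr)
              = V(\hat{t}_y) + V\bigl(\hat{B}(\hat{t}_y)\bigr)
                - 2\,C\bigl(\hat{t}_y, \hat{B}(\hat{t}_y)\bigr).
\end{align*}
```

The correction therefore pays off in MSE terms when $B^2(\hat{t}_y)$ exceeds $V(\hat{B}(\hat{t}_y)) - 2\,C(\hat{t}_y, \hat{B}(\hat{t}_y))$.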
One example
One specific two-phase design is considered.
First-phase sampling design: SI (simple random sampling without replacement) of size $n_a$.
Second-phase sampling design: Poisson with inclusion probability $\pi_{k|s_{a2}}$.
Then,
$$\hat{V}(\hat{t}_y) = C\, S^2_{y s_a},$$
$$\hat{V}(\hat{B}(\hat{t}_y)) = C \left( \check{S}^2_{e s_2} + \frac{1}{N - n_a} \sum_{s_2} (1 - \pi_{k|s_{a2}})\, \check{e}_k^2 \right),$$
$$\hat{C}(\hat{t}_y, \hat{B}(\hat{t}_y)) = \frac{C}{n_a - 1} \left( \sum_{s_2} x_k \check{e}_k - \frac{1}{n_a} \sum_{s_a} y_k \sum_{s_2} \check{e}_k \right),$$
where $C = (1 - f_a) N^2 / n_a$, $\check{e}_k = e_k / \pi_{k|s_{a2}}$ and
$$\check{S}^2_{e s_2} = \frac{1}{n_a - 1} \left( \sum_{s_2} \check{e}_k^2 - \frac{1}{n_a} \Bigl( \sum_{s_2} \check{e}_k \Bigr)^2 \right).$$

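The estimators for this design can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: $S^2_{ys_a}$ is taken to be the usual sample variance of $y$ over $s_a$, $f_a = n_a/N$, the covariance term is read as $C/(n_a-1)$ times the bracketed difference, and the function name `variance_estimates` is hypothetical.

```python
def variance_estimates(y_sa, x_s2, e_s2, pi_2_s2, N):
    """Variance, covariance and MSE estimates under SI first phase (size n_a)
    and Poisson second phase. Inputs for s_2 are aligned lists."""
    n_a = len(y_sa)
    f_a = n_a / N                       # first-phase sampling fraction (assumed)
    C = (1 - f_a) * N ** 2 / n_a

    # Sample variance of y over s_a (assumed meaning of S^2_{y s_a}).
    mean_y = sum(y_sa) / n_a
    s2_y = sum((y - mean_y) ** 2 for y in y_sa) / (n_a - 1)

    # Expanded errors e_k / pi_{k|s_a2} and their variance-type term.
    e_check = [e / p for e, p in zip(e_s2, pi_2_s2)]
    sum_e = sum(e_check)
    s2_e = (sum(ec ** 2 for ec in e_check) - sum_e ** 2 / n_a) / (n_a - 1)

    v_ty = C * s2_y
    poisson_term = sum((1 - p) * ec ** 2 for p, ec in zip(pi_2_s2, e_check))
    v_b = C * (s2_e + poisson_term / (N - n_a))
    cross = sum(x * ec for x, ec in zip(x_s2, e_check))
    c_tb = C / (n_a - 1) * (cross - sum(y_sa) * sum_e / n_a)

    mse_tz = v_ty + v_b - 2 * c_tb      # estimated MSE of the corrected total
    return v_ty, v_b, c_tb, mse_tz
```

Comparing `mse_tz` with `v_ty` plus the squared bias estimate gives a rough indication of whether the bias correction is worthwhile for a given subsample size.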
Simulation study: purpose
To compare survey estimates under two editing approaches:
Approach 1: selective editing only.
Approach 2: selective editing followed by bias correction.

Simulation study: setup
Population size: $N = 10000$
Sample size: $n_a = 1000$
True values: $z \sim \mathrm{Po}(5)$