
An Interval Estimation Approach to Selection Bias in Observational Studies



  1. An Interval Estimation Approach to Selection Bias in Observational Studies. Matt Tudball¹ with Rachael Hughes¹, Kate Tilling¹, Qingyuan Zhao² and Jack Bowden¹. ¹MRC Integrative Epidemiology Unit, University of Bristol; ²Department of Statistics, Wharton School, University of Pennsylvania. 20 June, 2019. Matt Tudball (MRC IEU) Selection Bias in Obs. Studies 20 June, 2019 1 / 11

  2. Motivating Problem. UK Biobank is a population cohort widely analysed by epidemiologists, health economists, clinicians, etc. During recruitment in 2006, only 500,000 of the 9.2 million invited individuals enrolled in the cohort (i.e. a response rate of about 5.5%). Follow-up studies show that participants tend to be better educated, earn more and have lower mortality than the UK population. This is called the ‘healthy volunteer’ effect.

  3. Motivating Problem. If we knew or could estimate each individual’s probability of entering the sample, we could perform inverse probability weighting. This reweights the sample so that it is more representative of the population from which it is drawn. However, in UK Biobank we do not observe any individual-level data on people who were not selected into the sample, so this approach is not possible.
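To make the weighting idea concrete, here is a minimal sketch of a normalised inverse-probability-weighted mean. It assumes the selection probabilities are known, which is exactly what is not available in UK Biobank; the function name `ipw_mean` and the toy numbers are illustrative and do not come from the slides.

```python
import numpy as np

def ipw_mean(y, selection_prob):
    """Normalised inverse-probability-weighted mean of y.

    Each sampled individual is weighted by 1 / (probability of selection),
    so individuals who were unlikely to be sampled count for more.
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(selection_prob, dtype=float)
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("selection probabilities must lie in (0, 1]")
    w = 1.0 / p
    return float(np.sum(w * y) / np.sum(w))

# Toy sample (hypothetical): individuals with y = 1 were four times
# more likely to be selected than individuals with y = 0.
y = np.array([1.0, 1.0, 0.0, 0.0])
p = np.array([0.8, 0.8, 0.2, 0.2])

unweighted = float(y.mean())   # ignores selection
weighted = ipw_mean(y, p)      # corrects for selection
print(unweighted, weighted)
```

In this toy sample the weighted mean is pulled towards the under-sampled group, which is the direction of correction one would expect under a healthy volunteer effect.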

  4. Existing Literature. Aronow and Lee (2013) (AL) propose a method which provides an interval of possible inverse probability weighted sample means in settings like this. The key assumption is that each individual’s probability of sample selection lies between two user-specified constants, a and b. For example, a = 1% and b = 90%. The method works by finding the configurations of individual-level weights which produce the largest and smallest weighted sample means, given the assumption above. A big advantage of this method is that it is fully non-parametric and allows selection on unobservables.
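The search over weight configurations can be sketched as follows. This is an illustration in the spirit of the bounds described above, not the authors' or AL's implementation. It assumes a normalised weighted mean with each weight between 1/b and 1/a, and it uses the fact that a ratio of this kind is extremised by a threshold rule on the sorted outcomes, so only n + 1 configurations need checking. The name `weighted_mean_bounds` is hypothetical.

```python
import numpy as np

def weighted_mean_bounds(y, a, b):
    """Smallest and largest normalised weighted mean of y when every
    weight 1 / p_i is allowed to vary with a <= p_i <= b."""
    if not 0 < a <= b <= 1:
        raise ValueError("need 0 < a <= b <= 1")
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    w_small, w_large = 1.0 / b, 1.0 / a

    # prefix[k] is the sum of the k smallest outcomes, k = 0..n.
    prefix = np.concatenate(([0.0], np.cumsum(y)))
    rest = prefix[-1] - prefix
    k = np.arange(n + 1)

    # Upper bound: small weight on the k smallest outcomes, large on the rest.
    upper = (w_small * prefix + w_large * rest) / (w_small * k + w_large * (n - k))
    # Lower bound: large weight on the k smallest outcomes, small on the rest.
    lower = (w_large * prefix + w_small * rest) / (w_large * k + w_small * (n - k))
    return float(lower.min()), float(upper.max())

# Toy example (hypothetical data): binary outcome, selection
# probabilities assumed to lie between 25% and 100%.
lo, hi = weighted_mean_bounds([0, 0, 1, 1], a=0.25, b=1.0)
print(lo, hi)
```

When a equals b, every weight is identical and both bounds collapse to the ordinary sample mean, which is a useful sanity check.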

  5. Limitations of AL. No applied papers have been written which use the AL method. We believe there are four key reasons for this:
     1) The AL method is limited to population means.
     2) AL did not propose a procedure for conducting statistical inference. That is, there are no confidence intervals or hypothesis tests.
     3) The bounds are often implausibly wide for reasonable choices of a and b, making interpretation difficult.
     4) AL assume no knowledge of the selection mechanism or of the population from which the sample is drawn.

  6. Summary of our Method. Our method builds on the AL estimator but addresses its limitations.
     1) Our method works for a wide variety of estimands, including OLS and IV.
     2) We propose and validate two approaches to constructing valid confidence intervals and hypothesis tests: one based on the percentile bootstrap and one based on the asymptotic distribution of stochastic programs.
     3) We show how to force the weights to be consistent with population-level information, thus tightening the bounds, sometimes significantly.
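As a rough illustration of the bootstrap idea in point 2), the sketch below resamples the data, recomputes the bounds on each resample, and takes a low percentile of the lower bounds and a high percentile of the upper bounds. Whether this matches the authors' procedure in detail is an assumption; the slides only name the percentile bootstrap. The helper names, the simulated data and the choices a = 0.1, b = 0.9 are all hypothetical.

```python
import numpy as np

def mean_bounds(y, a, b):
    """Bounds on the normalised weighted mean with weights in [1/b, 1/a]."""
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    prefix = np.concatenate(([0.0], np.cumsum(y)))
    rest = prefix[-1] - prefix
    k = np.arange(n + 1)
    w_small, w_large = 1.0 / b, 1.0 / a
    upper = (w_small * prefix + w_large * rest) / (w_small * k + w_large * (n - k))
    lower = (w_large * prefix + w_small * rest) / (w_large * k + w_small * (n - k))
    return float(lower.min()), float(upper.max())

def percentile_bootstrap_interval(y, a, b, n_boot=500, alpha=0.05, seed=0):
    """Percentile-bootstrap interval covering the whole set of bounds."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    lowers = np.empty(n_boot)
    uppers = np.empty(n_boot)
    for r in range(n_boot):
        # Resample individuals with replacement and recompute both bounds.
        resample = rng.choice(y, size=y.size, replace=True)
        lowers[r], uppers[r] = mean_bounds(resample, a, b)
    return (float(np.quantile(lowers, alpha / 2)),
            float(np.quantile(uppers, 1 - alpha / 2)))

# Simulated binary outcome (hypothetical data, not UK Biobank).
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.4, size=300).astype(float)

point_lo, point_hi = mean_bounds(y, a=0.1, b=0.9)
ci_lo, ci_hi = percentile_bootstrap_interval(y, a=0.1, b=0.9)
print((point_lo, point_hi), (ci_lo, ci_hi))
```

The resulting interval is meant to account for sampling variability on top of the uncertainty about selection that the bounds themselves express.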

  7. Summary of our Method, point 3). We consider three main types of population-level information:
     1) Survey response rate: We can force the optimising weights to imply the response rate for the survey, which is typically known to researchers.
     2) Population means: We can also force the optimising weights to imply known population means of variables in our sample. For example, we may want the weights to imply a male proportion of 50%.
     3) Parametric assumptions: We can impose a parametric form on the weights and choose variables within our sample which we believe are predictive of selection. We then optimise over the parameters of the function.
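The sketch below shows one standard way to express what a set of weights "implies", under the assumption that the weights are inverse selection probabilities. Among sampled individuals the average of 1/p equals one over the overall response rate, so the implied response rate is n divided by the sum of the weights, and an implied population mean is a normalised weighted mean. The logit form, the helper names and the toy numbers are hypothetical and not taken from the slides.

```python
import numpy as np

def logit_selection_prob(X, intercept, beta):
    """Selection probabilities from a logit model in sample covariates X."""
    linear = intercept + np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    return 1.0 / (1.0 + np.exp(-linear))

def implied_response_rate(p):
    """Response rate implied by the weights 1 / p in a sample of size n."""
    p = np.asarray(p, dtype=float)
    return float(p.size / np.sum(1.0 / p))

def implied_population_mean(x, p):
    """Population mean of x implied by the weights 1 / p."""
    x = np.asarray(x, dtype=float)
    w = 1.0 / np.asarray(p, dtype=float)
    return float(np.sum(w * x) / np.sum(w))

def logit(q):
    return float(np.log(q / (1.0 - q)))

# Toy setup (hypothetical): men are selected with probability 4% and
# women with probability 8%, so a sample has twice as many women as men.
male = np.array([1.0, 0.0, 0.0])
X = male.reshape(-1, 1)
intercept = logit(0.08)
beta = np.array([logit(0.04) - logit(0.08)])

p = logit_selection_prob(X, intercept, beta)
rate = implied_response_rate(p)                  # about 0.06
male_share = implied_population_mean(male, p)    # about 0.5
print(rate, male_share)
```

In a constrained version of the bounds, one would keep only those parameter values for which quantities like `rate` and `male_share` match their known population values, and optimise the estimand over what remains.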

  8. Applied Example: Education on Income. We estimate the effect of leaving school later than age 15 on the likelihood of earning more than £31,000 per annum in UK Biobank (Davies et al., 2018). We use the 1972 raising of the school leaving age (ROSLA) as an instrumental variable. We use a 12-month bandwidth and control for sex and month-of-birth indicators. We assume a logit specification for the weights as a function of household income over £31,000, years of education, days of physical activity per week and sex.

  9. Applied Example: Education on Income. We consider four specifications:
     Constraint 1) a = 0.1% and b = 50%.
     Constraint 2) Above plus constraining the direction of the selection effects. That is, we assume education, income and physical activity positively influence selection, while being male negatively influences selection.
     Constraint 3) Above plus constraining the response rate to be 5.5%.
     Constraint 4) All of the above plus constraining the proportion of males to be 49.5%.

  10. Applied Example: Education on Income. [Figure: interval estimates for the unweighted estimate and for Constraints 1) to 4). The horizontal axis is labelled "Interval Estimates" and runs from 0.0 to 0.8. The plot also carries the label [0.001, 0.5], which matches a = 0.1% and b = 50%.]

  11. Conclusion. This is a flexible sensitivity analysis which works for a variety of estimands. The assumptions can be selected by the researcher, ranging from a fully non-parametric selection-on-unobservables model to a parametric selection-on-(within-sample)-observables model. It also allows researchers to incorporate a suite of population-level information to tighten the bounds. Confidence intervals and hypothesis tests are available as well. * Paper will be up on arXiv very soon!
