Sample design and attrition in MCS



  1. Sample design and attrition in MCS Tarek Mostafa Centre for Longitudinal Studies t.mostafa@ioe.ac.uk

  2. Outline • The MCS sample and design • Attrition in the Millennium Cohort Study. What do we know about attrition in MCS? What can we do about it? • Access to MCS documentation. • Data structure and how to merge datasets.

  3. The MCS sample • The MCS population is defined as all children born between 1 September 2000 and 31 August 2001 (for England and Wales), and between 23 November 2000 and 11 January 2002 (for Scotland and Northern Ireland, see 2.2), who were alive and living in the UK at age nine months and eligible to receive Child Benefit (CB) at that age. CB was then a universal benefit. • After nine months, children remain in the population for as long as they remain living in the UK.

  4. The MCS population • The population includes: 1. Children living in non-household situations (women's refuges, hostels, hospitals, prisons etc.) at age nine months; this applies in principle, though in practice there may be none. 2. Children not born in the UK but established as resident in the UK at age nine months. • The population excludes: 1. Children who died before age nine months. 2. UK-born children who emigrated from the UK before age nine months. 3. Children not established as resident in the UK at age nine months, e.g. children of foreign diplomats, asylum seekers etc.

  5. The MCS sample design • The population was stratified by UK country: England, Wales, Scotland and Northern Ireland. • Each country had two strata: advantaged and disadvantaged families. England had an additional stratum for areas with a high percentage of ethnic minorities. • The primary sampling unit is the electoral ward. Small wards with very few births were combined into ‘super-wards’. • Minorities and disadvantaged families were oversampled. • Identified sample: 27,201 | Issued sample: 24,180 • Productive at wave 1: 18,552 • 692 new families joined MCS in wave 2.

  6. Problems of non-response/attrition • Distinction between unit (respondent) non-response and item non-response (the focus here is on the former) • Types of non-response (each with separate reasons): non-contact, refusal, inability, out of scope/ineligible • Non-response is increasing in all surveys • Non-response may not be permanent (in a panel survey) • Effects of non-response/attrition

  7. Attrition Definition • Attrition is the discontinued participation of some individuals in a longitudinal survey for reasons that are unknown and/or beyond the control of the researcher

  8. Productive sample over time (bar chart of productive sample size by wave) • Wave 1: 18,552 • Wave 2: 15,590 • Wave 3: 15,246 • Wave 4: 13,857 • Wave 5: 13,287

  9. Non-response
     Outcome            Wave 1   Wave 2   Wave 3   Wave 4   Wave 5
     Productive         18,552   15,590   15,246   13,857   13,287
     Not Issued            692        0        0    2,213    2,851
     Ineligible              0      167      300      126       78
     Untraced Movers         0      687      547      706      388
     Refusal                 0    1,739    2,315    1,811    2,196
     Non-Contact             0      930      546      123      438
     Other                   0      131      290      408        6
     Total              19,244   19,244   19,244   19,244   19,244
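The counts in the non-response table can be checked and turned into rates with a few lines of code. The sketch below is in Python and simply re-keys the figures from the table. The variable names are illustrative, and the rate shown (productive families as a share of all 19,244 families) is only one of several possible definitions of a response rate.

```python
# Outcome counts per wave, re-keyed from the non-response table (waves 1 to 5).
outcomes = {
    "Productive":      [18552, 15590, 15246, 13857, 13287],
    "Not Issued":      [692, 0, 0, 2213, 2851],
    "Ineligible":      [0, 167, 300, 126, 78],
    "Untraced Movers": [0, 687, 547, 706, 388],
    "Refusal":         [0, 1739, 2315, 1811, 2196],
    "Non-Contact":     [0, 930, 546, 123, 438],
    "Other":           [0, 131, 290, 408, 6],
}
TOTAL = 19244

# Each wave's outcomes should add up to the full sample of 19,244 families.
wave_totals = [sum(column) for column in zip(*outcomes.values())]

# Productive families as a share of all families ever in the sample.
productive_share = [n / TOTAL for n in outcomes["Productive"]]

for wave, (total, share) in enumerate(zip(wave_totals, productive_share), start=1):
    print(f"Wave {wave}: total={total:,}  productive={share:.1%}")
```

Excluding ineligible or not-issued families from the denominator would give a different rate, so the choice of denominator should be stated alongside any figure.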

  10. Types of non-response (chart of counts by wave, Wave 1 to Wave 5, for each outcome: Not Issued, Ineligible, Untraced Movers, Refusal, Non-Contact, Other; vertical axis 0 to 3,000)

  11. Monotone vs. non-monotone response • Monotone response: respondents dropped out without coming back. • Non-monotone response: interrupted response pattern over time (new families are a special case of non-monotone response).
      Type of non-response    Freq.       %
      Monotone                5,023    26.1
      Non-monotone            3,773    19.6
      All waves              10,448    54.3
      Total                  19,244   100.0
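The three response types can be derived mechanically from a family's wave-by-wave participation. The Python sketch below assumes each family is represented by a list of 0/1 flags (1 = productive at that wave); the function name and this representation are illustrative and are not taken from the MCS files.

```python
def classify(pattern):
    """Classify a response pattern given as 0/1 flags, one per wave."""
    flags = [int(bool(p)) for p in pattern]
    if not any(flags):
        raise ValueError("family was never productive")
    if all(flags):
        return "All waves"
    # Monotone: a run of 1s followed only by 0s, i.e. the family took part
    # from wave 1, dropped out once and never came back.
    if flags == sorted(flags, reverse=True):
        return "Monotone"
    # Everything else has a gap or a late start (e.g. new families at wave 2).
    return "Non-monotone"


examples = {
    "dropped out after wave 2": [1, 1, 0, 0, 0],
    "missed wave 2 only":       [1, 0, 1, 1, 1],
    "new family at wave 2":     [0, 1, 1, 1, 1],
    "took part throughout":     [1, 1, 1, 1, 1],
}
labels = {name: classify(flags) for name, flags in examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```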

  12. Sample composition over time: gender (chart of the percentage of boys and girls in the productive sample, Wave 1 to Wave 5; data labels range from 50.46 to 51.37 for the larger group and from 48.63 to 49.54 for the smaller group)

  13. Sample composition over time: class & ‘race’ (chart of the percentage of the productive sample that is Non-White and that is Managerial and professional, Wave 1 to Wave 5; vertical axis 0 to 35)

  14. Effects of non-response/attrition • Missing data: smaller samples, fewer transitions, incomplete histories. • Biases in results (all surveys): non-response is disproportionate among some groups (mobile, disadvantaged, young, men, long working hours). This is a problem if it is linked to the survey's topic focus or variables.

  15. Non-response bias - implications • Ignore the problem: equivalent to assuming that there is no sample bias. • Use adjustment techniques such as weighting and imputation.

  16. Sampling and attrition weights • Sampling weights: adjust the sample composition to take account of over-sampling in the first wave. • Attrition weights: adjust the sample composition to take account of the loss of particular types of respondents. • Adjustment means giving more importance (weight) to a particular group. • Overall weight = sampling weight x attrition weight
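A small worked example may make the multiplication of weights concrete. The Python sketch below uses made-up numbers for four respondents. The group labels, outcome values and weights are all hypothetical and are chosen only to show how the overall weight shifts a mean towards the under-represented group.

```python
# Hypothetical respondents: the disadvantaged group was over-sampled
# (small sampling weight) and lost more members (large attrition weight).
respondents = [
    {"group": "disadvantaged", "outcome": 10.0, "sampling_wgt": 0.5, "attrition_wgt": 1.6},
    {"group": "disadvantaged", "outcome": 10.0, "sampling_wgt": 0.5, "attrition_wgt": 1.6},
    {"group": "advantaged",    "outcome": 20.0, "sampling_wgt": 2.0, "attrition_wgt": 1.1},
    {"group": "advantaged",    "outcome": 20.0, "sampling_wgt": 2.0, "attrition_wgt": 1.1},
]

# Overall weight = sampling weight x attrition weight.
for r in respondents:
    r["overall_wgt"] = r["sampling_wgt"] * r["attrition_wgt"]

unweighted_mean = sum(r["outcome"] for r in respondents) / len(respondents)
weighted_mean = (
    sum(r["overall_wgt"] * r["outcome"] for r in respondents)
    / sum(r["overall_wgt"] for r in respondents)
)

print(f"unweighted mean: {unweighted_mean:.2f}")
print(f"weighted mean:   {weighted_mean:.2f}")
```

In this toy data the advantaged respondents carry a larger overall weight, so the weighted mean moves towards their outcome value.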

  17. Attrition weights construction • Logistic response models • Dependent variable: binary response outcome (0/1) in wave M. • Independent variables: characteristics of respondents in previous waves.
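A logistic response model of this kind can be sketched in a few lines. The Python example below is a minimal illustration, not the procedure used to build the MCS weights. It fits a one-covariate logistic model by plain gradient ascent on toy data, and it assumes that the attrition weight is the inverse of the predicted response probability, a common choice that the slides do not spell out. The covariate (a 0/1 flag from a previous wave) and all data values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_response_model(rows, steps=20000, lr=0.5):
    """Fit P(respond) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood. rows holds (x, responded) pairs."""
    b0 = b1 = 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, responded in rows:
            resid = responded - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Toy data: x is a hypothetical 0/1 characteristic measured at a previous wave;
# the second value is the response outcome (1 = productive) at wave M.
rows = [(0, 1)] * 8 + [(0, 0)] * 2 + [(1, 1)] * 5 + [(1, 0)] * 5

b0, b1 = fit_response_model(rows)
p_hat = {x: sigmoid(b0 + b1 * x) for x in (0, 1)}

# Assumed construction: attrition weight = 1 / predicted response probability.
attrition_weight = {x: 1.0 / p for x, p in p_hat.items()}

for x in (0, 1):
    print(f"x={x}: P(respond)={p_hat[x]:.3f}  attrition weight={attrition_weight[x]:.3f}")
```

With a single binary covariate the fitted probabilities equal the observed response rates in each group, so the group with the lower response rate receives the larger weight. In practice a statistical package with several covariates would replace the hand-written fitting loop.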

  18. MCS weights • Overall weights = Sampling weights x Attrition weights • Weights are prefixed in alphabetical order depending on the wave. • If you are doing an analysis with an outcome at wave 4, use the corresponding weight at the same wave.
      Weights                                        Wave   Variable name
      Sampling weight (country specific analyses)           weights1
      Sampling weight (whole of UK analyses)                weights2
      Overall weights (country specific analyses)    1      aovwt1
      Overall weights (whole of UK analyses)         1      aovwt2
      Overall weights (country specific analyses)    2      bovwt1
      Overall weights (whole of UK analyses)         2      bovwt2
      Overall weights (country specific analyses)    3      covwt1
      Overall weights (whole of UK analyses)         3      covwt2
      Overall weights (country specific analyses)    4      dovwt1
      Overall weights (whole of UK analyses)         4      dovwt2
      Overall weights (country specific analyses)    5      eovwt1
      Overall weights (whole of UK analyses)         5      eovwt2
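The naming rule for the overall weights (a wave letter, then "ovwt", then 1 for country-specific or 2 for whole-of-UK analyses) can be written as a small lookup. The Python helper below is an illustrative sketch; the function name is hypothetical, and it covers only the overall weights for waves 1 to 5 listed on this slide.

```python
def overall_weight_variable(wave, whole_uk):
    """Return the overall-weight variable name for a wave (integer 1 to 5)."""
    if wave not in (1, 2, 3, 4, 5):
        raise ValueError("overall weights are listed for waves 1 to 5 only")
    prefix = "abcde"[wave - 1]         # wave 1 -> a, wave 2 -> b, ...
    suffix = "2" if whole_uk else "1"  # 1 = country specific, 2 = whole of UK
    return f"{prefix}ovwt{suffix}"


# An outcome measured at wave 4, analysed for the whole of the UK:
print(overall_weight_variable(4, whole_uk=True))
# An outcome measured at wave 3, analysed within one country:
print(overall_weight_variable(3, whole_uk=False))
```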

  19. Applying survey design in Stata: svy • Command: svyset sptn00 [pweight=covwt2], strata(pttype2) fpc(nh2) • sptn00: electoral ward ID (the primary sampling unit). • covwt2: weight (you need to choose the correct one for your wave and analysis). • pttype2: stratum ID. • nh2: finite population correction.

  20. MCS datasets and access

  21. Documentation • Guide to the Datasets • Questionnaires (CAPI and paper) • Technical reports on sampling, response and fieldwork, Data notes • Data Dictionary • User Guides to Initial Findings (per sweep), to geographical identifiers, psychological scales, derived variables. • Online bibliography

  22. Available datasets: main survey data • Sweeps and ages: MCS1 (9 months), MCS2 (3 years), MCS3 (5 years), MCS4 (7 years), MCS5 (11 years), MCS6 (14 years) • Each X below marks one sweep for which the dataset is available:
      Longitudinal family file     X X X X X X
      Parental interview           X X X X X X
      Household grid               X X X X X X
      Child assessment             X X X X X
      Child measurement            X X X
      Neighbourhood assessment     X
      Older siblings               X X
      Child self completion        X X X
      Consent to data linkage      X X X X X X
      Derived variables            X X X X X X

  23. Available datasets: additional data • Sweeps and ages: MCS1 (9 months), MCS2 (3 years), MCS3 (5 years), MCS4 (7 years), MCS5 (11 years), MCS6 (14 years) • Each X below marks one sweep for which the dataset is available:
      Geographical linked data                     X X X X
      Foundation stage profile                     X
      Teacher survey                               X X X
      Birth registration and maternity episodes    X
      Health visitor survey                        X
      Oral fluid                                   X X
      Activity monitor                             X X
      Time use record                              X
      Nursery observations                         Undeposited

  24. Access • Registration for UK-based researchers has the following steps: • Apply for a username and password. This can be done on this page: http://www.data-archive.ac.uk/sign-up/credentials-application • Complete an online registration form after logging in. • When downloading the data, you will also be asked to register your project online (30 words). • The whole process is quick and straightforward.

  25. Access • Website: http://discover.ukdataservice.ac.uk/series/?sn=2000031 • MCS datasets come under three different licence types: 1. End User Licence = easy! 2. Special Access Licence = difficult 3. Secure Access Licence = difficult, and impossible for non-UK-based researchers.

  26. Access • All the aforementioned datasets under the End User Licence can be accessed and downloaded once users are fully registered. • Special Licence Access: applies to Hospital of Birth. Available to non-UK-based researchers, but the process is more difficult. • Secure Access: access to sensitive information such as geographical identifiers and administrative data. It is possible to link to many datasets. Data are not downloadable and can be accessed only via remote desktop.

  27. Secure Access – linking • Geographical identifiers: MCS1-MCS4: Ward level; MCS1-MCS4: Lower Super Output Area; MCS1-MCS4: Output area. It is possible to link area-level datasets to MCS: level of poverty of the neighbourhood, presence of services or other amenities, etc. • Education administrative datasets: MCS1-MCS4: Linked Education Administrative Dataset - Scotland; MCS1-MCS4: Linked Education Administrative Dataset - Wales; MCS1-MCS4: Linked Education Administrative Dataset - England

  28. Data Structure and Linking Datasets

  29. Linking Datasets • Understand the layout of the datasets • How you link depends on what you want: family level (e.g. bedrooms in household); interview level (e.g. cohort child outcomes); respondent level (e.g. income of mother); twins and triplets.
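As an illustration of why the level of the data matters when linking, the Python sketch below merges a family-level file onto a child-level file. All identifiers and values are hypothetical (family_id, child_no, bedrooms and outcome_score are placeholder names, not MCS variable names). The point is that a family-level value is repeated for every cohort child in that family, which is how twins and triplets end up sharing the same household information.

```python
# Hypothetical family-level file: one row per family.
families = [
    {"family_id": "F001", "bedrooms": 3},
    {"family_id": "F002", "bedrooms": 2},
]

# Hypothetical child-level file: one row per cohort child.
# Family F002 has twins, so its family_id appears twice.
children = [
    {"family_id": "F001", "child_no": 1, "outcome_score": 52},
    {"family_id": "F002", "child_no": 1, "outcome_score": 47},
    {"family_id": "F002", "child_no": 2, "outcome_score": 49},
]


def merge_family_onto_children(families, children, key="family_id"):
    """Many-to-one merge: attach each family's columns to its children."""
    lookup = {}
    for fam in families:
        if fam[key] in lookup:
            raise ValueError(f"duplicate {key} in family file: {fam[key]}")
        lookup[fam[key]] = fam
    # Children with no matching family keep only their own columns.
    return [{**lookup.get(child[key], {}), **child} for child in children]


merged = merge_family_onto_children(families, children)
for row in merged:
    print(row)
```

The duplicate check guards against the most common merge error: treating a file as one row per family when it actually holds one row per respondent or per child.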
