


  1. Accessing Statistics Canada Data and Resources
     Hugh McCague, Valerie Preston, Walter Giesbrecht, Sara Tumpane

  2. Outline
     • Survey Terminology
     • Research Data Centre (RDC)
     • RDC versus Public Use Microdata Files (PUMF)
     • Accessing the RDC
     • Statistics Canada Surveys and Data
     • Statistical Software
     • Research Opportunities
     • Statistical Consulting Service
     • Resources

  3. Some Survey Terminology
     • Population
     • Elements
     • Sample: Simple Random Sample, Probability Sample
     • Response Rate
     • Weights: Simple Weights

  4. Some Survey Terminology
     • Demographics
     • Strata
     • Clusters (primary sampling units, PSUs)
     • Complex Sample
     • Complex Weights, Bootstrap and Jackknife Replicate Weights
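To make the idea of replicate weights concrete, here is a minimal sketch in Python. It assumes one common estimator: compute the statistic once with the main survey weight, once with each of B replicate weight sets, and average the squared deviations. The function names and the toy replicate weights are hypothetical, and the data are synthetic. A real survey ships its own replicate weights and documents the exact variance formula to use with them.

```python
import random


def weighted_mean(values, weights):
    """Return the weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def replicate_variance(values, main_weights, replicate_weights):
    """Variance of the weighted mean from B sets of replicate weights.

    Assumed estimator: (1/B) * sum((theta_b - theta)^2), where theta uses
    the main weights and theta_b uses the b-th replicate weights.
    """
    theta = weighted_mean(values, main_weights)
    theta_reps = [weighted_mean(values, w_b) for w_b in replicate_weights]
    return sum((t - theta) ** 2 for t in theta_reps) / len(theta_reps)


def make_toy_replicates(main_weights, n_reps, rng):
    """Build illustrative replicate weights by resampling respondents.

    Hypothetical stand-in: each replicate multiplies the main weight by the
    number of times the respondent was drawn with replacement.
    """
    n = len(main_weights)
    replicates = []
    for _rep in range(n_reps):
        counts = [0] * n
        for _draw in range(n):
            counts[rng.randrange(n)] += 1
        replicates.append([w * c for w, c in zip(main_weights, counts)])
    return replicates


# Synthetic data only: 200 respondents, 50 replicate weight sets.
rng = random.Random(2024)
y = [rng.gauss(50, 10) for _ in range(200)]
w = [rng.uniform(50, 150) for _ in range(200)]
reps = make_toy_replicates(w, 50, rng)

estimate = weighted_mean(y, w)
variance = replicate_variance(y, w, reps)
print(f"weighted mean = {estimate:.2f}, standard error = {variance ** 0.5:.2f}")
```

The same pattern applies to any statistic, not just the mean: replace `weighted_mean` with the estimator of interest and re-run it under each replicate weight set.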

  5. Some Survey Terminology
     • Cross-sectional data
     • Longitudinal data: periods, waves, cycles, trajectory, life course
     • Attrition: attrition rate
     • Helpful reference: Ornstein, Michael. A Companion to Survey Research. London; Thousand Oaks, CA: SAGE, 2013.
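As a small worked example of two of these terms, the sketch below computes a response rate and wave-by-wave attrition rates. The definitions are assumptions chosen for illustration (respondents divided by eligible sampled units, and the share of wave-1 respondents lost by each later wave); surveys may define both rates differently. The function names and counts are hypothetical.

```python
def response_rate(n_respondents, n_eligible_sampled):
    """Share of eligible sampled units that responded."""
    return n_respondents / n_eligible_sampled


def attrition_rates(wave_counts):
    """Cumulative attrition at each wave relative to the first wave.

    wave_counts[0] is the number of respondents at wave 1; later entries
    count how many of those same respondents remain at each later wave.
    """
    baseline = wave_counts[0]
    return [(baseline - n) / baseline for n in wave_counts]


# Hypothetical longitudinal panel: 1000 respondents at wave 1.
counts = [1000, 900, 750]
print(response_rate(1000, 1250))
print(attrition_rates(counts))
```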

  6. Research Data Centre (RDC)
     • Access to Statistics Canada data and statistical software
     • Microdata & administrative data
     • For York students and faculty, access is free
     • A “secure” environment
     • Researchers are “deemed employees” of Statistics Canada
     • Must work in RDC
     • CRDCN Network

  7. The CRDCN Network

  8. York RDC
     • 282 York Lanes
     • Staffed by:
        o Analyst Sara Tumpane (yorkrdc2@yorku.ca)
        o Assistant Theresa Kim (yorkrdc3@yorku.ca)
     • 8 workstations
     • Open 3-3.5 days per week
     • http://www.isr.yorku.ca/rdc/

  9. Before you apply to the RDC…
     • Consider your options
     • Is what you need available in a more readily accessible source (either a PUMF or an aggregate file)?

  10. RDC or PUMF?
     Confidential Microdata in Research Data Centres
     • Characteristics:
        o Contains most of the original information collected during the survey
        o Continuous variables are accessible
        o Longitudinal identifiers provided
        o Contains bootstrap weights used for calculating exact variance
     • Access is appropriate when:
        o Sensitive variables not provided in PUMF
        o A PUMF does not exist
        o Longitudinal data is necessary
        o Analytical work is complex in nature
     Public Use Microdata Files accessed online
     • Characteristics:
        o Manipulated by aggregating, capping, or deleting variables that could be “identifiers”; survey respondents cannot be identified
        o Many continuous variables transformed into categorical variables
        o Longitudinal identifiers stripped
     • Access is appropriate when:
        o Immediate data access is required
        o Analysis is for a course paper or equivalent
        o Data exploration
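The PUMF transformations named above (capping a variable and turning a continuous variable into categories) can be sketched in a few lines of Python. This is an illustration only: the function names, the cap, and the age bands are hypothetical and are not taken from any Statistics Canada file.

```python
def top_code(value, cap):
    """Cap a value so that extreme cases cannot single out a respondent."""
    return min(value, cap)


def to_category(value, cut_points, labels):
    """Map a continuous value to a category label.

    cut_points must be ascending; labels needs one more entry than
    cut_points. A value below cut_points[i] gets labels[i]; anything at or
    above the last cut point gets the final label.
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]


# Hypothetical master-file style record with exact values.
record = {"age": 34, "income": 250000}

# Hypothetical PUMF-style version: grouped age, capped income.
age_cuts = [20, 40, 65]
age_labels = ["under 20", "20-39", "40-64", "65 and over"]
public_record = {
    "age_group": to_category(record["age"], age_cuts, age_labels),
    "income_capped": top_code(record["income"], 150000),
}
print(public_record)
```

The trade-off shown here is the one the slide describes: the public record is safer to release, but the exact age and income are no longer available for analysis.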

  11. CCHS 2012 Example 1
     Master File
     • 1815 variables
     • Sources of personal income
        o wages and salaries
        o income from self-employment
        o dividends and interest
        o employment insurance
        o worker's compensation
        o CPP or QPP
        o job related retirement pensions
        o RRSP/RRIF
        o OAS and GIS
        o social assistance/welfare
        o child tax benefits
        o child support
        o alimony
        o other
        o none
     PUMF
     • 1381 variables
     • Sources of personal income
        o Employment inc.
        o EI/Worker's comp
        o Senior benefits
        o Other

  12. CCHS 2012 Example 2 (PUMF and Master File)
     • Geography
        o Province of residence of respondent
        o Province of residence of respondent - (G)
        o Postal code - (D)
        o Health Region - (G)
        o Health region of residence of respondent - (D)
        o B.C. Health Authority (BCHA) - (D)
        o Sub-health region (Québec only) - (D)
        o Nova Scotia district health authority
        o British Columbia local health authority - (D)
        o Regional health authority (RHA) - Alberta - (D)
        o British Columbia health authority - (D)
        o Local health integrated networks - Ontario - (D)
        o 2006 census dissemination area
        o Federal electoral district - (D)
        o Census subdivision - (D)
        o Census division - (D)
        o Statistical area classification type - (D)
        o 2006 Census metropolitan area (CMA)
        o Health region peer group
        o Urban and rural areas
        o Urban and rural areas - 2 levels - (D)
        o Subzones for Alberta
        o Manitoba health authority - (D)

  13. Accessing PUMFs & master file metadata
     • Statistics Canada Nesstar data portal
        o metadata only, for PUMFs and master files
        o http://www62.statcan.ca/webview/
     • YUL: Data & Statistics library guide
        o http://researchguides.library.yorku.ca/data
     • <odesi> (OCUL)
        o http://www.library.yorku.ca/e/resolver/id/1165738

  14. http://www.andertoons.com/data/cartoon/6543/things-good-stuff-ok-i-reiterate-request-for-specific-data

  15. How to apply to an RDC and available datasets
     • RDC Application Pages
     • SSHRC Website
     • Data available in the RDCs

  16. Accessing the RDC
     • Action: Apply through the SSHRC website
        o Timeline: 1-2 hours
        o Notes: Provide list of academic contributions; 5-10 page project proposal
     • Action: Evaluation of the proposal
        o Timeline: 2-4 weeks
        o Notes: Approval based on relevance of methods and data, and demonstrated need for microdata
     • Action: Security screening process
        o Timeline: 1-3 weeks for approval
     • Action: Sign Microdata Research Contract
        o Timeline: 1-3 weeks for approval

  17. Project Proposal
     • The project proposal is a maximum of ten pages and includes the following elements:
        o Title of the Project
        o Rationale and objectives of the study
        o Proposed data analysis and software requirements
        o Data requirements
        o Expected project start and end dates
        o Expected products
        o References

  18. Data at the RDC
     • Canadian Community Health Survey (CCHS): 2001-2014
        o Health status, health care utilization, and health determinants
        o Annual Component (starting in 2001, N ~ 130,000)
        o Mental Health (2002, 2012), N ~ 37,000
        o Nutrition (2004), N ~ 35,000
        o Healthy Aging (2008-2009), N ~ 52,000 (sample 45+)
     • Canadian Health Measures Survey (CHMS): 2011, 2012, 2013
        o Survey and administrative data
     • Hate Crime Data (Pilot): 2010-2012
        o Characteristics of hate-motivated criminal incidents, victims, and accused persons

  19. Data (continued)
     • General Social Survey (GSS): 1985-2014
        o Health (1985, 1991)
        o Time Use (1986, 1992, 1998, 2005, 2010)
        o Victimization (1988, 1993, 1999, 2004, 2009, 2014)
        o Education, Work and Retirement (1989, 1994)
        o Family (1990, 1995, 2001, 2006, 2011)
        o Caregiving and Care Receiving (1996, 2002, 2007, 2012)
        o Access to and Use of Information Technology (2000)
        o Social Networks/Social Identity (2003, 2008, 2013)
        o Giving, Volunteering and Participating (2013)
     • National Longitudinal Survey of Children and Youth (NLSCY): 8 cycles
        o Development and well-being: birth to early adulthood
        o Follow-ups every two years to age 25

  20. Data by Themes
     • Health and Health Care
        o National Population Health Survey (NPHS)
        o Participation and Activity Limitation Survey (PALS)
        o Canadian Tobacco, Alcohol and Drugs Survey (CTADS)
     • Occupations and Organizations
        o Workplace and Employee Survey (WES)
        o Survey of Labour and Income Dynamics (SLID)
        o Census
     • Education
        o Youth in Transition Survey (YITS)
        o National Graduates Survey (NGS)
     • Race and Ethnicity
        o Aboriginal Peoples Survey (APS)
        o Longitudinal Survey of Immigrants to Canada (LSIC)
        o Ethnic Diversity Survey (EDS)

  21. Pilot Data
     • Canadian Cancer Registry (CCR)
     • Vital Statistics
     • Uniform Crime Reporting
     • Homicide Survey
     • Hate Crime Data
     • Ministry of Community and Social Services (MCSS)
     • Citizenship and Immigration Canada (CIC)

  22. Which Statistical Software to Use at the York RDC?
     Features to Consider
     • SPSS 23
     • SAS 9.4
     • Stata 13
     • R 3.0.3
     Statistical Software Resources: Institute for Digital Research and Education (IDRE), UCLA http://www.ats.ucla.edu/stat/

  23. An Example of a Psychology Research Project at the York RDC
     • Ames, M. E., Rawana J. S., Gentile P., and Morgan A. S. “The protective role of optimism and self-esteem on depressive symptom pathways among Canadian Aboriginal youth.” Journal of Youth and Adolescence 44.1 (2013): 142-154.
     • National Longitudinal Survey of Children and Youth
     • Complex Sample Design, Post-Stratification
     • Longitudinal Linear Mixed Models with Mediation

  24. A Few of Many Quantitative Methods Research Opportunities
     • Extending methods to Complex Sample Designs
     • Proper methods for the Structural Equation Modeling of Complex Survey Data are strongly needed (Bollen et al., 2013)
     • R package lavaan.survey has started to address this issue (Oberski, 2014)
     • Item Response Theory with Complex Survey Data needs much more development (Cyr and Davies, 2005)

  25. Statistical Consulting Service (SCS)
     • Statistical consulting provided by a group of York faculty and graduate students with staff at the Institute for Social Research (ISR)
     • Usually, no fee for York faculty and student researchers
     • Online appointment scheduler

  26. http://truthfacts.com/truthfacts/2014/04/09

  27. Statistical Consulting Service (SCS)
     • ISR/SCS Short Courses and Spring Seminar Series on data analysis, qualitative research methods, survey methods, and related software
     • More details: http://www.isr.yorku.ca/centres/scs/

  28. Contact Information and Resources
     • http://www.isr.yorku.ca/qmforum
