Accessing Statistics Canada Data and Resources Hugh McCague Valerie Preston Walter Giesbrecht Sara Tumpane
Outline • Survey Terminology • Research Data Centre (RDC) • RDC versus Public Use Microdata Files (PUMF) • Accessing the RDC • Statistics Canada Surveys and Data • Statistical Software • Research Opportunities • Statistical Consulting Service • Resources
Some Survey Terminology • Population • Elements • Sample: Simple Random Sample, Probability Sample • Response Rate • Weights: Simple Weights
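As a rough illustration of how simple weights enter an estimate, the sketch below computes a weighted mean. The ages and weights are hypothetical values invented for the example, and `weighted_mean` is an illustrative helper, not a function from any survey package.

```python
def weighted_mean(values, weights):
    """Estimate a population mean from sample values and survey weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sample of three respondents. Each weight is the number of
# population elements the respondent is assumed to represent.
ages = [30, 50, 40]
weights = [100, 300, 100]

estimate = weighted_mean(ages, weights)    # weighted estimate of the mean age
unweighted = sum(ages) / len(ages)         # ignores the sampling design
print(estimate, unweighted)
```

Here the second respondent represents three times as many people as the others, so the weighted estimate is pulled toward that respondent's value.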
Some Survey Terminology • Demographics • Strata • Clusters (primary sampling units, PSUs) • Complex Sample • Complex Weights, Bootstrap and Jackknife Replicate Weights
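A minimal sketch of how bootstrap replicate weights are commonly used for variance estimation: the estimate is recomputed once per replicate weight set, and the variance is taken as the average squared deviation of those replicate estimates from the full-sample estimate. This formula is an assumption for illustration. Some surveys require an adjustment factor (for example with mean bootstrap weights), so the survey's own documentation should be checked. All data values and function names below are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values using one set of survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_variance(values, main_weights, replicate_weights):
    """Variance of a weighted mean from bootstrap replicate weights.

    Assumes the plain formula (1/B) * sum((theta_b - theta_hat)^2),
    where theta_hat uses the main weights and theta_b uses replicate b.
    """
    theta_hat = weighted_mean(values, main_weights)
    replicate_estimates = [weighted_mean(values, w) for w in replicate_weights]
    b = len(replicate_estimates)
    return sum((t - theta_hat) ** 2 for t in replicate_estimates) / b


# Hypothetical data: three respondents and two replicate weight sets.
# Real files typically supply hundreds of replicate weight columns.
values = [1.0, 2.0, 3.0]
main_weights = [1.0, 1.0, 1.0]
replicate_weights = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 2.0],
]

variance = bootstrap_variance(values, main_weights, replicate_weights)
standard_error = math.sqrt(variance)
print(variance, standard_error)
```

In practice, survey routines in SAS, Stata, SPSS, or R would usually handle this step. The sketch only shows the arithmetic behind it.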
Some Survey Terminology • Cross-sectional data • Longitudinal data: periods, waves, cycles, trajectory, life course • Attrition: attrition rate • Helpful reference: Ornstein, Michael. A Companion to Survey Research. London; Thousand Oaks, CA: SAGE, 2013.
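As a small worked example of an attrition rate, the sketch below computes cumulative attrition per wave as the share of first-wave respondents no longer responding. Definitions vary (cumulative versus wave-to-wave), so this is one assumed convention, and the respondent counts are hypothetical.

```python
def cumulative_attrition(respondents_by_wave):
    """Share of wave-1 respondents lost by each wave (wave 1 itself is 0)."""
    if not respondents_by_wave or respondents_by_wave[0] <= 0:
        raise ValueError("the first wave must have a positive respondent count")
    baseline = respondents_by_wave[0]
    return [1 - n / baseline for n in respondents_by_wave]


# Hypothetical counts of continuing respondents in three waves of a panel.
counts = [1000, 900, 810]
for wave, rate in enumerate(cumulative_attrition(counts), start=1):
    print(f"wave {wave}: {rate:.1%} attrition")
```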
Research Data Centre (RDC) • Access to Statistics Canada data and statistical software • Microdata & administrative data • For York students and faculty, access is free • A “secure” environment • Researchers are “deemed employees” of Statistics Canada • Must work in the RDC • Part of the Canadian Research Data Centre Network (CRDCN)
The CRDCN Network
York RDC • 282 York Lanes • Staffed by: Analyst Sara Tumpane (yorkrdc2@yorku.ca) and Assistant Theresa Kim (yorkrdc3@yorku.ca) • 8 workstations • Open 3-3.5 days/week • http://www.isr.yorku.ca/rdc/
Before you apply to the RDC… • Consider your options • Is what you need available from a more readily accessible source (either a PUMF or an aggregate file)?
RDC or PUMF?
Confidential Microdata in Research Data Centres
• Characteristics:
o Contains most of the original information collected during the survey
o Continuous variables are accessible
o Longitudinal identifiers provided
o Contains bootstrap weights used for calculating exact variance
• Access is appropriate when:
o Sensitive variables are not provided in the PUMF
o A PUMF does not exist
o Longitudinal data is necessary
o Analytical work is complex in nature
Public Use Microdata Files accessed online
• Characteristics:
o Manipulated by aggregating, capping, or deleting variables that could be “identifiers”; survey respondents cannot be identified
o Many continuous variables transformed into categorical variables
o Longitudinal identifiers stripped
• Access is appropriate when:
o Immediate data access is required
o Analysis is for a course paper or equivalent
o Data exploration
CCHS 2012 Example 1
Master File
• 1815 variables
• Sources of personal income
o wages and salaries
o income from self-employment
o dividends and interest
o employment insurance
o worker's compensation
o CPP or QPP
o job related retirement pensions
o RRSP/RRIF
o OAS and GIS
o social assistance/welfare
o child tax benefits
o child support
o alimony
o other
o none
PUMF
• 1381 variables
• Sources of personal income
o Employment inc.
o EI/Worker's comp
o Senior benefits
o Other
CCHS 2012 Example 2
Master File
• Geography
o Province of residence of respondent
o Postal code - (D)
o Health region of residence of respondent - (D)
o Sub-health region (Québec only) - (D)
o Nova Scotia district health authority
o British Columbia local health authority - (D)
o Regional health authority (RHA) - Alberta - (D)
o British Columbia health authority - (D)
o Local health integrated networks - Ontario - (D)
o 2006 census dissemination area
o Federal electoral district - (D)
o Census subdivision - (D)
o Census division - (D)
o Statistical area classification type - (D)
o 2006 Census metropolitan area (CMA)
o Health region peer group
o Urban and rural areas
o Urban and rural areas - 2 levels - (D)
o Subzones for Alberta
o Manitoba health authority - (D)
PUMF
• Geography
o Province of residence of respondent - (G)
o Health Region - (G)
o B.C. Health Authority (BCHA) - (D)
Accessing PUMFs & master file metadata • Statistics Canada Nesstar data portal o metadata only, for PUMFs and master files o http://www62.statcan.ca/webview/ • YUL: Data & Statistics library guide o http://researchguides.library.yorku.ca/data • <odesi> (OCUL) o http://www.library.yorku.ca/e/resolver/id/1165738
How to apply to an RDC and available datasets • RDC Application Pages • SSHRC Website • Data available in the RDCs
Accessing the RDC (action, timeline, notes)
• Apply through the SSHRC website: 1-2 hours. Provide a list of academic contributions and a 5-10 page project proposal
• Evaluation of the proposal: 2-4 weeks. Approval is based on the relevance of methods and data, and demonstrated need for microdata
• Security screening process: 1-3 weeks for approval
• Sign Microdata Research Contract: 1-3 weeks for approval
Project Proposal • The project proposal is a maximum of ten pages and includes the following elements: o Title of the Project o Rationale and objectives of the study o Proposed data analysis and software requirements o Data requirements o Expected project start and end dates o Expected products o References
Data at the RDC • Canadian Community Health Survey (CCHS) : 2001-2014 o Health status, health care utilization, and health determinants • Annual Component (starting in 2001, N~130,000) • Mental Health (2002, 2012) N ~ 37,000 • Nutrition (2004) N ~ 35,000 • Healthy Aging (2008-2009) N ~ 52,000 (sample 45+) • Canadian Health Measures Survey (CHMS) : 2011, 2012, 2013 o Survey and administrative data • Hate Crime Data (Pilot) : 2010-2012 o Characteristics of hate-motivated criminal incidents, victims, and accused persons
Data (continued) • General Social Survey (GSS) : 1985-2014 • Health (1985, 1991) • Time Use (1986, 1992, 1998, 2005, 2010) • Victimization (1988, 1993, 1999, 2004, 2009, 2014) • Education, Work and Retirement (1989, 1994) • Family (1990, 1995, 2001, 2006, 2011) • Caregiving and Care Receiving (1996, 2002, 2007, 2012) • Access to and Use of Information Technology (2000) • Social Networks/Social Identity (2003, 2008, 2013) • Giving, Volunteering and Participating (2013) • National Longitudinal Survey of Children and Youth (NLSCY) : 8 cycles o Development and well-being: birth - early adulthood o Follow-ups every two years to age 25
Data by Themes • Health and Health Care • National Population Health Survey (NPHS) • Participation and Activity Limitation Survey (PALS) • Canadian Tobacco, Alcohol and Drugs Survey (CTADS) • Occupations and Organizations • Workplace and Employee Survey (WES) • Survey of Labour and Income Dynamics (SLID) • Census • Education • Youth in Transition Survey (YITS) • National Graduates Survey (NGS) • Race and Ethnicity • Aboriginal Peoples Survey (APS) • Longitudinal Survey of Immigrants to Canada (LSIC) • Ethnic Diversity Survey (EDS)
Pilot Data • Canadian Cancer Registry (CCR) • Vital Statistics • Uniform Crime Reporting • Homicide Survey • Hate Crime Data • Ministry of Community and Social Services (MCSS) • Citizenship and Immigration Canada (CIC)
Which Statistical Software to Use at the York RDC? Features to Consider • SPSS 23 • SAS 9.4 • Stata 13 • R 3.0.3 Statistical Software Resources: Institute for Digital Research and Education (IDRE), UCLA http://www.ats.ucla.edu/stat/
An Example of a Psychology Research Project at the York RDC • Ames, M. E., Rawana, J. S., Gentile, P., and Morgan, A. S. “The protective role of optimism and self-esteem on depressive symptom pathways among Canadian Aboriginal youth.” Journal of Youth and Adolescence 44.1 (2013): 142-154. • National Longitudinal Survey of Children and Youth • Complex Sample Design, Post-Stratification • Longitudinal Linear Mixed Models with Mediation
A Few of Many Quantitative Methods Research Opportunities • Extending methods to Complex Sample Designs • Proper methods for the Structural Equation Modeling of Complex Survey Data are strongly needed (Bollen et al., 2013) • R package lavaan.survey has started to address this issue (Oberski, 2014) • Item Response Theory with Complex Survey Data needs much more development (Cyr and Davies, 2005)
Statistical Consulting Service (SCS) • Statistical consulting provided by a group of York faculty and graduate students with staff at the Institute for Social Research (ISR) • Usually no fee for York faculty and student researchers • Online appointment scheduler
Statistical Consulting Service (SCS) • ISR/SCS Short Courses and Spring Seminar Series on data analysis, qualitative research methods, survey methods, and related software • More details: http://www.isr.yorku.ca/centres/scs/
Contact Information and Resources • http://www.isr.yorku.ca/qmforum