Harnessing new social data for effective social policy and service delivery
Official statistics and “new” sources of data
OECD Workshop – Paris, October 16th 2019
Pascal Rivière, INSEE - Head, General Inspectorate

Introduction
• On the one hand: a considerable amount of data is now available, and the volume seems to grow exponentially; demand also grows quickly
• On the other hand: National Statistical Institutes (NSIs) build aggregated information based on … surveys (and some administrative sources)
• How to improve our data collection?
  – New types of surveys?
  – Or couldn’t we simply use the available data and respond more quickly and efficiently to requests?

Official statistics
• Providing statistics on a regular basis, with some requirements:
  – Representing a population
  – Comparability through time, between countries
  – Quality requirements (e.g. Eurostat requirements)
  – Use of standard classifications (e.g. NACE: classification of economic activities)
• Standard approach
  – Sampling, questionnaire design, data collection, data editing, handling of nonresponse, variance calculation … and a whole theory supporting the process
• This process ...
  – Is more and more expensive (response rates decreasing) (Meyer & al 2015)
  – Gives the impression of being slow and tedious
  – But “new” ways: web surveys, mixed-mode surveys, admin sources, big data?

Administrative sources: “new” social data
• Many kinds of administrative sources
  – Registers of individuals, business registers, tax returns, social security data, health data, education data...
• Administrative data often derived from administrative declarations
  – This increases quality (compulsory declarations, …) (Rivière 2018)
• More recent uses:
  – Statistics at a more detailed level
  – Linking different themes by linking sources (e.g. health and socio-economic characteristics)
• New opportunities ... and new problems
  – Instability, statistical units (household vs person), lack of control over definitions, lack of exhaustiveness, operational use

Linking administrative sources: new horizons
• Need for a general framework for statistics based on administrative sources (Hand 2018)
  – Register-based statistics (Wallgren 2007)
• Record linkage
  – With or without ID: obviously far more complex if no common ID
  – Need for preservation of privacy if ID
  – Dynamic field of research (many papers in the last 5 years)
  – Reference books: (Herzog & al 2007, Christen 2012, Winkler 2015)
  – Requires an infrastructure and a dedicated environment → no super database
• Statistical offices, but also
  – administrative world (for example unified administrative declarations in France or Belgium),
  – academic world (US, UK, Italy, Netherlands, German Record Linkage Center, CASD in France, ...)
• In some NSIs, global vision with a general data strategy on social data based on administrative sources and surveys:
  – Typically Australia, Canada, Netherlands (Bakker & al 2014)
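
The contrast between linkage with and without a common ID can be illustrated with a minimal Python sketch. It is not the method of any particular statistical office: the record fields (`id`, `name`, `birth_date`), the keyed hash used to avoid exposing the raw ID, the blocking on birth date and the 0.85 similarity threshold are all assumptions made for illustration.

```python
import hashlib
import hmac
from difflib import SequenceMatcher


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace a raw ID by a keyed hash so that sources can be linked
    without exchanging the ID itself (one possible privacy measure)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link_with_id(source_a, source_b, secret_key):
    """Deterministic linkage: exact match on the pseudonymized common ID."""
    index_b = {pseudonymize(rec["id"], secret_key): rec for rec in source_b}
    pairs = []
    for rec in source_a:
        match = index_b.get(pseudonymize(rec["id"], secret_key))
        if match is not None:
            pairs.append((rec, match))
    return pairs


def name_similarity(name_a: str, name_b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def link_without_id(source_a, source_b, threshold=0.85):
    """Approximate linkage without a common ID: compare only records with
    the same birth date (blocking), then keep the most similar name if it
    reaches the threshold (hypothetical value)."""
    pairs = []
    for rec_a in source_a:
        candidates = [r for r in source_b if r["birth_date"] == rec_a["birth_date"]]
        best = max(
            candidates,
            key=lambda r: name_similarity(rec_a["name"], r["name"]),
            default=None,
        )
        if best is not None and name_similarity(rec_a["name"], best["name"]) >= threshold:
            pairs.append((rec_a, best))
    return pairs
```

Without an ID, every link becomes a judgment under uncertainty (spelling variants, homonyms, missing values), which is why this case is far more complex and why the reference books cited above rely on probabilistic models rather than a single threshold.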

Examples in the French case
• Recent evolutions post-2010
• Permanent demographic sample: links demographic data, economic data, social data on 4% of the population (sample drawn according to day of birth)
  – Sources: civil registry data, tax data, data from the nominative social declaration
  – Examples of papers: differential mortality by social class, standard of living
  – Coming soon (2020): health data will be linked
• Secured Data Access Center (CASD) (Perignon & al, 2019, Science):
  – Remote access to individual data
  – High level of security
  – Record linkage tools
  – Many uses by researchers throughout Europe (more recently US and Canada)
  – International Data Access Network
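
Drawing a sample according to day of birth can be sketched as follows. The set of reference days is purely hypothetical (the slide does not list the days actually used); 15 days are chosen here only because 15 / 365.25 is close to the 4% mentioned above.

```python
from datetime import date

# Hypothetical reference days as (month, day) pairs: 1 to 15 March.
# The real days used for the permanent demographic sample are not given here.
REFERENCE_DAYS = {(3, day) for day in range(1, 16)}


def in_sample(birth_date: date) -> bool:
    """A person belongs to the sample if and only if born on a reference day."""
    return (birth_date.month, birth_date.day) in REFERENCE_DAYS


def expected_sampling_rate() -> float:
    """Approximate share of the population selected, assuming births are
    spread evenly over the year."""
    return len(REFERENCE_DAYS) / 365.25
```

The useful property of such a rule is that it is deterministic: the same person is selected in every source and every year, so civil registry, tax and social records for that person can be brought together over time.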

Evolution of data collection modes in National Statistical Institutes
• Face-to-face surveys: gold standard – high quality
  – Expensive but essential in some cases (e.g. long and complex questionnaires)
• Telephone surveys:
  – Development in the 90s, varying a lot from one country to another
• Web surveys, beginning in the 00s
  – Advantages / drawbacks compared to phone surveys
• Mixed-mode surveys (combining different modes) (Dillman & al 2014)
• Administrative data, for different purposes:
  – Replacing a survey, checking data quality, combining with other modes, calibrating
• Big data, main cases
  – Satellite data (agriculture), mobile phone data, cash register data (price indexes)
  – Big data can be used in some cases as a complement but not as the main source of data

Particular surveys
• Some surveys on sensitive topics require particular methodology
• Homeless (Yaouancq & al 2014)
  – Surveys with two sampling steps: sample of services (shelters, social assistance services), and then sample of homeless people who use these services
  – Types of surveys depend on the objective: social characteristics of the population, or simply counting homeless people
  – The analysis shows a significant proportion of people who received social assistance before reaching the age of majority
• Living environment and safety survey (Cadre de vie et sécurité)
  – Face-to-face interview
  – Survey to find out about the acts of delinquency of which households and their members may have been victims
  – The survey also covers people's opinions on security, particularly in their living environment, notably to measure their "feeling of insecurity"
  → separate module on serious violence was converted into a self-administered computerized questionnaire
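
The two sampling steps used for the homeless survey (services first, then users of the selected services) can be sketched in Python. This is a simplified illustration and not the actual survey design: simple random sampling at both stages, the function name and the data layout (a dictionary mapping each service to its list of users) are assumptions.

```python
import random


def two_stage_sample(services, n_services, n_users, seed=0):
    """Stage 1: draw services; stage 2: draw users within each drawn service.

    `services` maps a service name to the list of people using it.
    Returns (service, person, weight) tuples, where the weight is the
    inverse of the person's inclusion probability through that service.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(services), n_services)
    p_service = n_services / len(services)
    sample = []
    for service in chosen:
        users = services[service]
        if not users:
            continue  # nothing to draw from an empty service
        k = min(n_users, len(users))
        p_user = k / len(users)
        for person in rng.sample(users, k):
            sample.append((service, person, 1.0 / (p_service * p_user)))
    return sample
```

A person who uses several services can be reached through each of them, so the weights of this sketch would overcount such people; a real design has to correct for these multiple chances of selection.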

In conclusion
• NSIs are very familiar with survey methodology, which makes it possible to control the quality of statistics with proven but sometimes slow and/or expensive professional techniques
• Their collection methods have evolved over time to meet new challenges and have been able to adapt to specific situations
• New data sources are significantly changing the context
• Linking an ever-increasing number of administrative sources with each other or with surveys is the most promising avenue
• Big data in general is more a complementary possibility than a real new collection method
Thank you