Improving the quality of national survey data in South Africa through digital data collection

Mahier Hattas 1, Johan Breytenbach 2

Abstract

Digital data collection (DDC) offers national statistical organizations (NSOs) in Africa possible, albeit partial, solutions to several current performance and profitability concerns. Perceived potential benefits of DDC methods over paper-based collection methods include increased speed of data collection, increased data accuracy, timeous data availability, higher data quality, increased security of data, and lower costs of data collection. Secondary benefits may include better-informed policies from governmental departments reliant on NSOs for strategic data. This article presents data from two iterations of a large-scale DDC implementation in South Africa, in which aspects related to collection speed, data accuracy, availability, quality, and costs of data collection receive attention. The implications of this research will affect the standard generic statistical value chain for the collection of household surveys. Findings include, inter alia: poor initial speed of DDC interviews followed by a significant speed increase as interviewers master DDC technology and skills; the importance of effective training within DDC processes; proof of higher accuracy in geographic data capturing; real-time availability of data; a shorter data cleaning and release process; and higher initial costs of mobile devices.

Keywords: digital data collection, data quality, secure data

Introduction

In planning local economic development, the South African government and private sector rely on the country's national statistical organization (NSO), Stats SA 3, for high-quality, digitised data. These data include censuses and the demographic and economic data series on which national development strategies are built.
In 2015/16, Stats SA, like most NSOs in Africa, relied on manual, paper-based data collection methods in survey work to produce the majority of the data and statistics that inform local economic development policy. Paper-based methods involve time-intensive and expensive processes, including the printing of paper questionnaires, manual and/or scanned data entry, and processing of the collected data. These processes not only delay the production of data for decision making, but also require more personnel than comparable digital data collection and processing processes, thereby contributing to high costs. Moreover, manual errors are hard to avoid, compromising data quality. In comparison to DDC, the manual, paper-based methods used by NSOs can be summarized as being slower, more expensive, and more complex to manage from a quality assurance perspective. In order to keep abreast of ever-changing technologies, current methodologies, processes, systems and quality standards such as the South African Statistical Quality Assessment Framework (SASQAF) (8) should be reviewed for potential gains in social efficiencies, quality, and productivity, as well as for their potential to enhance the economic potential of the developmental objectives of a country such as South Africa.

1 Department of Information Systems, University of the Western Cape, South Africa. Corresponding author: mhattas@gmail.com
2 Department of Information Systems, University of the Western Cape, South Africa.
3 http://www.statssa.gov.za

During 2015 and 2016 a wide range of new digital data collection (DDC) methods and sources became available to Stats SA, primarily as a result of recent improvements in technical hardware (1), namely the lower cost and better technical features of mobile devices, and in connectivity in South Africa (2), namely mobile ownership and network coverage. Since 2015 such mobile technologies and DDC processes have been deployed and tested by Stats SA for household survey data collection, to make data on a selection of key indicators available in near real-time. This paper provides early insight into an iterative action research approach adopted by the organisation for DDC implementation. Key concerns that received attention were the speed of data collection, the timeous availability of data, the costs of data collection, and data quality. Two 2015/2016 DDC pilot projects are discussed and analysed. The latter pilot, in particular, produced some valuable guidelines for future DDC implementations by African NSOs.

Background

The current trend of African NSOs transforming their manual processes towards DDC is motivated by the need to ensure (i) the maximisation of profit, precision, and accuracy of data collection, (ii) the reduction of non-responses and item responses in surveys, (iii) the security of data (viz.
ensuring confidentiality and integrity) and (iv) an increase in the overall quality of collection processes (3). Community surveys have typically involved capturing data using paper questionnaires at the household level and then sending completed questionnaires to a data processing center for data entry and data cleaning. Current technological advances and economic trends have motivated the use of hand-held computers or personal digital assistants (PDAs) as a viable alternative to manual data collection (4). Proponents argue that direct data capture at the point of interview can reduce error rates and speed up the data cleaning process, and hence make databases available for analysis significantly sooner (4).

Literature

Stats SA, by virtue of its annual household surveys (including Community Surveys) and national censuses, requires cheap, versatile (mobile) technologies for use by in-person field interviewers, i.e. for personal interviews in which enumerators record responses directly on smartphones or data-connected tablets (Computer Assisted Personal Interview (CAPI)). CAPI can drastically improve the speed and quality with which survey data is collected. With the increasing availability of mobile data networks and internet access, instant transmission of collected data to cloud-based servers is increasingly viable and circumvents the tedious traditional data entry process. Using integrated cloud services also makes data available for processing and analysis in near real-time. Computer-assisted data collection (CADAC) includes both computer-assisted interviewing and online data collection, with the latter eliminating the need for an interviewer. From the 1970s