Managing sensitive data and authorship in Humanities and Social Sciences Louise Corti Collections Development and Producer Support ODIN conference, Cologne October 2013
Overview • Introducing the UK Data Service • Our data portfolio and users • Citation, impact measurement and DOIs • Challenges for social science citation
The UK Data Archive • Based at the University of Essex, since 1967 • 45 years of selecting, ingesting, curating and providing access to social science data • designated as Place of Deposit by The National Archives • Data and data support services for higher and further education for research, teaching and learning • Recently attained the highest information security standard, ISO 27001
University of Essex The Archive
SISTER DATA ARCHIVES • Council of European Social Science Data Archives (CESSDA) • ICPSR (USA): Inter-University Consortium for Political and Social Research • ADA: Australian Social Science Data Archive
What is the UK Data Service? • Comprehensive data resource funded by the UK Economic and Social Research Council • Single virtual point of access to a wide range of secondary data for social science research (Directed from Essex) • Offer promotion, support, training and guidance
What does the UK Data Service do? • Put together a collection of the most valuable data • Preserve data for the long term for future research purposes • Make the data and documentation available for reuse • Provide data management advice for data creators • Provide training and support for users of the service • Bring together owners, producers and users • Demonstrate impact through evidence of usage • Easy access through website - ukdataservice.ac.uk
Who is our service for? • Data for secondary analysis, research, policy making • Teaching and learning • Academic researchers and students • Government analysts • Charities and foundations • Business consultants • Independent research centres • Think tanks
Our data portfolio • Over 6,000 datasets in the collection • 230 new datasets added each year • Official agencies - mainly central government • International statistical time series • Individual academics’ research grants • Market research agencies • Public records/historical sources • Access to international data via links with other data archives worldwide
UK survey series • High quality repeated cross-sectional surveys • Individual or household level data • Cover many topics including health, work, crime, social attitudes, family expenditure, living costs, housing etc. • Labour Force Survey • British Crime Survey • Health Survey for England • British Social Attitudes • Annual Population Survey ….
Cross-national surveys and macro databanks • Eurobarometers • European Social Survey • European Values Survey • International Social Survey Programme • Time series data aggregated to country/region • International governmental organisations (IMF, OECD, IEA, World Bank)
Longitudinal studies • British Household Panel Survey and Understanding Society • Understanding Society (2009-) • English Longitudinal Study of Ageing • Families and Children Study • Growing Up in Scotland • Longitudinal Study of Young People in England
UK census data • 1971-2011 census data • Baseline for other statistics • Detailed combinations of characteristics • Small geographies • Census outputs • Aggregate data • Boundary data • Flow data • Microdata
Business data • Collected through a wide range of surveys, and administrative sources: • productivity, innovation, workforce skills, earnings • international trade, foreign direct investment • research and development • business demography • industrial relations
Qualitative data • Interviews, focus groups • Essays, diaries, open-ended survey questions • Observations, case notes etc. • Family Life and Work Experience before 1918, Middle and Upper Class Families in the Early 20th Century, 1870-1977 • Gender Difference, Anxiety and the Fear of Crime, 1995 • Mothers Alone: Poverty and the Fatherless Family, 1955-1966
Usage of data • Operate a spectrum of access: • Web download under End User Licence • Permission only via Special Licence access • ‘Approved researcher’ access via remote secure access • Over 22,000 registered users • Approximately 60,000 downloads worldwide p.a. • 3,000+ user support queries • End User Licence includes: • Appropriate data usage • Full citation of data and informing us of re-use • Have always provided a citation format
Evidence of access and re-use User access information • Collect user information and ‘projects’ upon registration • Collate data and documentation download statistics • Users can share project information for others to see • Report data access stats on demand Usage information • Email all users every 6 months after registration about activity • Manually add all research output references to the data record • Reporting rate of publications is poor! • Prior to DOIs, scanned citation literature for dataset mentions – very manual and unreliable, and data were poorly cited
Impactful case studies of use • Identify and seek out case studies of re-use: research or teaching. • Very successful! • 125 case studies in our database • Can help provide impact stories for data owners/producers and users • And can inspire others! • Some are harvested by ESRC for their website • Often include ongoing work – no need to wait for publications
Our persistent identifiers approach • Our data collections are not digital objects • Need to capture changes made to data • Versioning data in a commonly understood manner • Needed rule-based definition of a ‘significant’ change • Integrate processes with digital preservation activities & workflows • In 2011 we assigned DataCite DOIs for all of our collections • Mint and update DOIs with our metadata management infrastructure
Recording significant change • Approx. 15% of UKDA data collections are altered within the first year after first publication • We have distinguished between major and minor changes to a data collection = high impact vs. low impact • DOI allocated to a metadata instance of a data collection • DOIs resolve to jump page pointing to all external instances • New DOI = high impact change, with explicit logging • Provide access only to the most up-to-date version of data
Major changes – high impact • New variable added • New labels/value codes added • Weighting variables reconstructed • Wrong data supplied (e.g., March not April) • Miscoded data (e.g., Don’t know/Refused confused) • Change in format (file migration) • Significant changes in documentation • Change in access conditions
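The rule-based distinction between major and minor changes can be sketched in a few lines of Python. This is a minimal illustration, not the Archive's actual implementation: the `Change` names and the `needs_new_doi` function are hypothetical, and the single low-impact example (`TYPO_IN_DOCS`) is an assumption added here. Only the high-impact categories come from the slide above.

```python
from enum import Enum


class Change(Enum):
    # High-impact (major) changes, taken from the slide.
    NEW_VARIABLE = "new variable added"
    NEW_LABELS = "new labels/value codes added"
    WEIGHTS_RECONSTRUCTED = "weighting variables reconstructed"
    WRONG_DATA = "wrong data supplied"
    MISCODED_DATA = "miscoded data"
    FORMAT_CHANGE = "change in format (file migration)"
    DOC_SIGNIFICANT = "significant changes in documentation"
    ACCESS_CONDITIONS = "change in access conditions"
    # Hypothetical low-impact (minor) example, not from the source.
    TYPO_IN_DOCS = "typo corrected in documentation"


# Every change type except the assumed minor one counts as high impact.
HIGH_IMPACT = frozenset(Change) - {Change.TYPO_IN_DOCS}


def needs_new_doi(changes):
    """Return True if any change in the batch is high impact."""
    return any(change in HIGH_IMPACT for change in changes)
```

Under these assumptions, a batch containing any major change would trigger a new DOI and a log entry, while a batch of minor changes alone would keep the existing DOI.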
Raising awareness in the social sciences • ESRC funding for short-term project on citation • Advocacy for best practice in citing research data • Audiences • Professional organisations • Academic publishers and journal editors • Researchers and postgraduates • Key activities • Data citation principles for social sciences • Personal communications • Events with BL DataCite, JISC and wider PI community • Outreach through Doctoral Training Centres
Making
Demonstrating impact with citation • Assuming better use of DOIs… • Starting to search for use of our DOIs – Google • Automate this process and compile reports; promote • Gather data citation statistics from Thomson Reuters Data Citation Index. One of the early 20 feeder repositories, but our own access limited! • Work with BL DataCite and ODIN to gain connectivity between identifiers & outputs – early adopters
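Automating the search for DOI mentions could start with a simple text scan. The sketch below is illustrative only: the prefix `10.5255`, the `count_doi_mentions` function, and the matching rules are assumptions made here, and it works on text already collected rather than querying any search engine or citation index.

```python
import re
from collections import Counter

# Assumed DOI prefix for illustration; substitute the repository's real prefix.
PREFIX = "10.5255"

# A DOI here is the prefix, a slash, then a run of non-space characters.
DOI_PATTERN = re.compile(re.escape(PREFIX) + r"/[^\s\"<>]+", re.IGNORECASE)


def count_doi_mentions(texts):
    """Count how often each DOI with the assumed prefix appears across texts."""
    counts = Counter()
    for text in texts:
        for match in DOI_PATTERN.findall(text):
            # Strip trailing punctuation that usually belongs to the sentence,
            # and lower-case because DOIs are case-insensitive.
            counts[match.rstrip(".,;:)]").lower()] += 1
    return counts
```

The resulting counts could feed a periodic report per data collection. Catching informal mentions of a dataset by title would need a different approach.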
CHALLENGES FOR THE FUTURE • Citing parts (fragments) of data collections • single files • subsets of quantitative data • extracts of textual data • ESRC project Digital Futures will enable extract-level citation within a web-based browsing system • Using rich, highly structured XML metadata • GUIDs for everything
UK QualiBank
Resolving citation objects • Will enable extract-level citation • Citation object and citation format created on the fly, using GUIDs and URIs • URI resolves directly to the data extract • Some more sensitive collections will be closed, so cannot resolve to data • As yet uncertain of relationship to our collection-level DOIs
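Creating a citation object on the fly might look like the following sketch. Everything specific in it is hypothetical: the `BASE_URI` resolver address, the `CitationObject` fields, the use of character offsets to delimit an extract, and the citation string layout are assumptions for illustration, not the Digital Futures design.

```python
import uuid
from dataclasses import dataclass

# Hypothetical resolver base URL, used only for illustration.
BASE_URI = "https://example.org/extract/"


@dataclass(frozen=True)
class CitationObject:
    guid: str
    uri: str
    citation: str


def cite_extract(collection_title, start, end, guid=None):
    """Build a citation object for a text extract (character offsets start..end)."""
    # Generate a fresh GUID unless the caller supplies an existing one.
    guid = guid or str(uuid.uuid4())
    uri = BASE_URI + guid
    citation = f"{collection_title}, extract {start}-{end}. {uri}"
    return CitationObject(guid=guid, uri=uri, citation=citation)
```

For a closed collection, the same object could still be generated, but its URI would resolve to a metadata page rather than the extract itself.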
CONTACT UK Data Service University of Essex Wivenhoe Park Colchester Essex CO4 3SQ • T +44 (0)1206 872001 • E corti@essex.ac.uk