
aine@sensiblecode.io duncan@sensiblecode.io 2021 Census Outputs - PowerPoint PPT Presentation



  1. aine@sensiblecode.io duncan@sensiblecode.io

  2. 2021 Census Outputs and Dissemination Update Suzie Dunsmith & Neil Townsend ONS June 2019

  3. We’re committed to delivering 2021 Census results earlier, more flexibly and with greater accessibility
     • Using innovative methods developed by our Statistical Disclosure Control experts we designed an approach to dissemination which meets these aims
     • Last year we worked with Sensible Code Company who built these methods into a “proof of concept” prototype
     • We are now developing methods, processes and specifications across several workstreams

  4. • Output content – derived variables, classifications, geography etc.
     • Analysing table design to inform dissemination development
     • Origin-destination outputs
     • Microdata samples
     • Admin data integration
     • Metadata incl. Welsh language requirements
     • NS accreditation – OSR consultation
     • Analysis and data visualisation
     • UK data

  5. Any questions or feedback please contact:
     About the 2021 Census and/or other areas of ONS:
     Respond to the Office for Statistics Regulation’s: user consultation

  6. AGENDA Introductions SensibleCode/Welsh Government Census 2021 ONS, UK Case Study Interactive session/Q&A 6 | Sensible Code

  7. Our Challenge “To learn more about the challenges being faced by professionals who are considering privacy issues on a regular basis; how they address these issues given the desire to open data and the fact that many more sources of data are being made available. What's being considered and the factors influencing these decisions”

  8. We make products that modernise the processing and dissemination of data

  9. Problems
     ● disclosure control is a manual process
     ● surge of new data sources
     ● increasing capacity is a challenge
     ● pressure to publish more and sooner
     ● privacy preserving tech is new & landscape is foggy

  10. [No text on this slide]

  11. ● NSIs want to modernise and automate SDC.
      ● Disseminating data closer to the collection date increases their value to the economy.
      ● Users expect to be able to see more granular data for more diverse populations.
      ● Users want to query the data more flexibly.

  12. Flexible dissemination through real-time application of disclosure control techniques in response to user queries

  13. [No text on this slide]

  14. [No text on this slide]

  15. TableBuilder: what does it do?
      ● Best-in-class aggregation speed using an optimized data format
      ● Allow users to choose “any” output table within limits
        ○ dubbed “Flexible Dissemination”
      ● In real-time:
        ○ apply perturbative Statistical Disclosure Control (SDC)
        ○ use SDC rules post perturbation and redact data if necessary

  16. How it works

  17. Census Data
      Person dataset: 57 million rows, 28 variables, 41 mappings
      Household dataset: 22 million rows, 11 variables, 11 mappings
      Join both datasets to associate household variables and mappings with people
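
The join described on this slide can be illustrated with a minimal sketch. The records, field names (`household_id`, `tenure`, `age_band`, `oa`) and the function `join_people_to_households` are all hypothetical and chosen only for illustration; the slides do not show how TableBuilder performs the join or what its optimized data format looks like.

```python
from collections import Counter

# Hypothetical toy records. The real datasets are far larger
# (about 57 million person rows and 22 million household rows).
households = {
    "H1": {"tenure": "owned", "oa": "OA001"},
    "H2": {"tenure": "rented", "oa": "OA002"},
}
people = [
    {"person_id": 1, "household_id": "H1", "age_band": "16-64"},
    {"person_id": 2, "household_id": "H1", "age_band": "0-15"},
    {"person_id": 3, "household_id": "H2", "age_band": "65+"},
]


def join_people_to_households(people, households):
    """Attach each person's household variables to the person record."""
    joined = []
    for person in people:
        household = households[person["household_id"]]
        joined.append({**person, **household})
    return joined


joined = join_people_to_households(people, households)

# After the join, a person-level table can use a household variable.
tenure_by_age = Counter((row["tenure"], row["age_band"]) for row in joined)
```

Once joined, every person row carries its household's variables, so a person-level cross-tabulation such as tenure by age band needs no further lookup.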

  18. Geographical Data
      2 Countries
      10 Regions
      350 Local Authorities (LA)
      7,200 Middle Layer (MSOA)
      35,000 Lower Layer (LSOA)
      180,000 Output Areas (OA) (average about 300 people)
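
As a rough illustration of how counts at the finest level could be rolled up through the levels listed on this slide, here is a small sketch. It assumes each Output Area belongs to exactly one parent area at every level; the area codes and the `OA_LOOKUP` table are hypothetical, and the slides do not describe how TableBuilder stores geography.

```python
from collections import defaultdict

# Levels from finest to coarsest, as listed on the slide.
LEVELS = ["OA", "LSOA", "MSOA", "LA", "Region", "Country"]

# Hypothetical lookup from each Output Area to its parent areas.
OA_LOOKUP = {
    "OA1": {"LSOA": "LSOA1", "MSOA": "MSOA1", "LA": "LA1", "Region": "R1", "Country": "C1"},
    "OA2": {"LSOA": "LSOA1", "MSOA": "MSOA1", "LA": "LA1", "Region": "R1", "Country": "C1"},
    "OA3": {"LSOA": "LSOA2", "MSOA": "MSOA2", "LA": "LA2", "Region": "R1", "Country": "C1"},
}


def roll_up(oa_counts, level):
    """Sum Output Area counts into areas at the requested level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if level == "OA":
        return dict(oa_counts)
    totals = defaultdict(int)
    for oa, count in oa_counts.items():
        totals[OA_LOOKUP[oa][level]] += count
    return dict(totals)


oa_counts = {"OA1": 310, "OA2": 295, "OA3": 280}
la_counts = roll_up(oa_counts, "LA")
region_counts = roll_up(oa_counts, "Region")
```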

  19. Statistical Disclosure Control (SDC)
      ● TableBuilder does perturbation using the cell-key method
        ○ some modifications for ONS
        ○ consistent zero perturbation: always query whole data set
      ● Apply post-perturbation rules
        ○ a publishable table must pass all of the rules
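
The general idea of the cell-key method can be sketched as follows: every record carries a fixed random key, a cell's key is derived from the keys of the records in it, and that cell key selects the noise. The same set of records therefore always receives the same perturbation, whichever table it appears in. This is a toy sketch only: `KEY_RANGE`, `toy_noise` and the noise thresholds are hypothetical, a real perturbation table normally depends on the cell count as well as the cell key, and the ONS modifications mentioned on the slide (including zero perturbation) are not reproduced here.

```python
import random

KEY_RANGE = 256  # hypothetical size of the record-key space


def assign_record_keys(n_records, seed=0):
    """Give every record a fixed pseudo-random key (done once, up front)."""
    rng = random.Random(seed)
    return [rng.randrange(KEY_RANGE) for _ in range(n_records)]


def cell_key(record_keys):
    """Combine the keys of the records in a cell into one cell key."""
    return sum(record_keys) % KEY_RANGE


def toy_noise(key):
    """Hypothetical stand-in for a perturbation table lookup."""
    if key < 32:
        return -1
    if key >= 224:
        return 1
    return 0


def perturb_cell(record_keys):
    """Return the perturbed count for a cell holding these records."""
    count = len(record_keys)
    return max(count + toy_noise(cell_key(record_keys)), 0)


keys = assign_record_keys(5, seed=42)
cell = [keys[0], keys[2], keys[3]]
# The same records give the same result regardless of their order.
same_result = perturb_cell(cell) == perturb_cell(list(reversed(cell)))
```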

  20. SDC: Handling “Structural” Zeros
      ● Naive approach
        ○ Force zero the “impossible” combinations of categories
        ○ Problem: enumerating all the combinations
      ● TableBuilder: automatic preservation of structural zeros
        ○ Use zero count at higher geographic level as indicator
        ○ Sensitive to geographic variation
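
One way to read the "zero count at higher geographic level" indicator is sketched below: a category combination that has a zero count in the enclosing higher-level area is treated as a structural zero and left at zero, while other zero cells remain eligible for perturbation. The function names, the example categories and the `add_one` perturbation are hypothetical; the slides do not give TableBuilder's actual procedure.

```python
def structural_zero_cells(higher_level_counts):
    """Category combinations with a zero count at the higher geography."""
    return {cell for cell, count in higher_level_counts.items() if count == 0}


def perturb_preserving_structural_zeros(local_counts, structural, perturb):
    """Perturb every cell except those flagged as structural zeros."""
    result = {}
    for cell, count in local_counts.items():
        if cell in structural:
            result[cell] = 0  # keep "impossible" combinations at zero
        else:
            result[cell] = perturb(cell, count)
    return result


# Hypothetical counts for the enclosing Local Authority.
la_counts = {
    ("0-15", "married"): 0,    # no such people anywhere in the LA
    ("65+", "single"): 140,
    ("16-64", "married"): 5200,
}
# Counts for one small area inside that Local Authority.
oa_counts = {
    ("0-15", "married"): 0,    # structural zero
    ("65+", "single"): 0,      # zero here, but possible elsewhere in the LA
    ("16-64", "married"): 12,
}


def add_one(cell, count):
    """Hypothetical perturbation that adds 1 to every eligible cell."""
    return count + 1


structural = structural_zero_cells(la_counts)
published = perturb_preserving_structural_zeros(oa_counts, structural, add_one)
```

In this toy run only the first cell is protected from noise. Because the indicator is taken from the enclosing area, the set of structural zeros can differ from one part of the country to another.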

  21. [No text on this slide]

  22. SDC: Which tables can be published?
      ● Formalise SDC “rules”
        ○ Publishable tables must pass all of the rules
      ● Selective by geography
        ○ more tables are available in areas with diverse population
      ● Data controllers can experiment with rule parameters
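
The "must pass all of the rules" check, applied per area and with adjustable parameters, might look like the sketch below. The two rules (`min_total` and `max_share_of_ones`) and their thresholds are invented for illustration; the slides do not list the actual ONS rules.

```python
def min_total(minimum):
    """Hypothetical rule: the table total must reach a minimum."""
    def rule(counts):
        return sum(counts) >= minimum
    return rule


def max_share_of_ones(limit):
    """Hypothetical rule: limit the share of cells that equal 1."""
    def rule(counts):
        if not counts:
            return True
        ones = sum(1 for c in counts if c == 1)
        return ones / len(counts) <= limit
    return rule


def publishable_areas(tables_by_area, rules):
    """Areas whose table passes every rule."""
    return [
        area
        for area, counts in tables_by_area.items()
        if all(rule(counts) for rule in rules)
    ]


# The same table design evaluated in two hypothetical areas.
tables_by_area = {
    "LA1": [40, 35, 25, 30],  # counts spread across categories
    "LA2": [1, 1, 0, 98],     # counts concentrated in one category
}

strict = publishable_areas(tables_by_area, [min_total(50), max_share_of_ones(0.25)])
relaxed = publishable_areas(tables_by_area, [min_total(50), max_share_of_ones(0.5)])
```

With the stricter parameter only `LA1` passes; relaxing it admits `LA2` as well, which is the kind of experiment with rule parameters the slide mentions.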

  23. [No text on this slide]

  24. Demonstration

  25. Q & A - Thank you aine@sensiblecode.io
