


  1. Do Computer Science Definitions of Privacy Satisfy Legal Definitions of Privacy? The Case of FERPA and Differential Privacy CS LAW Kobbi Nissim, Ben-Gurion University and Center for Research on Computation and Society, Harvard University. Privacy Enhancing Technologies for Biometric Data, Haifa University, Jan 17, 2016

  2. This Work: a Collaboration CS LAW Product of a working group (meeting since Nov 2014). Contributing to this project: • Center for Research on Computation and Society (CRCS): Kobbi Nissim, Aaron Bembenek, Mark Bun, Marco Gaboardi, Thomas Steinke, and Salil Vadhan • Berkman Center for Internet & Society: David O'Brien, Alexandra Wood, and Urs Gasser

  3. Privacy Tools for Sharing Research Data • Goal: help social scientists share privacy-sensitive research data via a collection of technological and legal tools • A problem: traditional privacy protection techniques have repeatedly been shown to fail to provide reasonable privacy

  4. Data Privacy • Studied (at least) since the '60s • Approaches: de-identification, redaction, auditing, noise addition, synthetic datasets, … • Focus on how to provide privacy, not on what privacy protection is • May have been suitable for the pre-internet era • Re-identification [Sweeney '00, …]: GIS data, health data, clinical trial data, DNA, pharmacy data, text data, registry information, … • Blatant non-privacy [Dinur, Nissim '03], … • Auditors [Kenthapadi, Mishra, Nissim '05] • AOL debacle '06 • Genome-wide association studies (GWAS) [Homer et al. '08] • Netflix Prize [Narayanan, Shmatikov '09]; Netflix canceled its second contest • Social networks [Backstrom, Dwork, Kleinberg '11] • Genetic research studies [Gymrek, McGuire, Golan, Halperin, Erlich '11] • Microtargeted advertising [Korolova '11] • Recommendation systems [Calandrino, Kilzer, Narayanan, Felten, Shmatikov '11] • Israeli CBS [Mukatren, Nissim, Salman, Tromer '14] • Attacks on statistical aggregates [Homer et al. '08] [Dwork, Smith, Steinke, Vadhan '15] • … Slide idea stolen shamelessly from Or Sheffet

  5. Privacy Tools for Sharing Research Data • Goal: help social scientists share privacy-sensitive research data via a collection of technological and legal tools • A problem: privacy protection techniques repeatedly shown to fail to provide reasonable privacy • Differential privacy [Dwork, McSherry, N, Smith 2006]: a formal mathematical privacy concept • Addresses weaknesses of traditional schemes (and more) • Has a rich theory, now in the first stages of implementation and testing

  6. The Protagonists • Differential Privacy: a mathematical definition of privacy. $M : X^n \to T$ satisfies $\varepsilon$-differential privacy if $\forall x, x' \in X^n$ s.t. $d_H(x, x') = 1$ and $\forall S \subseteq T$: $\Pr[M(x) \in S] \le e^{\varepsilon} \Pr[M(x') \in S]$. • FERPA (the Family Educational Rights and Privacy Act): a legal standard of privacy

  7. A use case: The Privacy Tools for Sharing Research Data Project * http://privacytools.seas.harvard.edu/

  8. [Diagram: sharing Alice's MOOC data via the Dataverse Network] Alice has cool MOOC data; it contains student info protected by FERPA, plus other IRB policies, terms of use, … Other researchers may find it useful. Bob: "Should I apply for access??? Is it worth the trouble?" Bob: "Alice's data, please." Dataverse: "Restricted!" Privacy Tools: access to Alice's data w/differential privacy. Does DP satisfy FERPA? * http://dataverse.org/ http://privacytools.seas.harvard.edu/

  9. Short digression: Motivating Differential Privacy

  10. Just before this talk … an interesting discussion: "How many Justin Bieber fans attend the workshop?" "Highly sensitive personal info; how can this be done?" Trusted Party: "I will do the survey … I will publish only the result … and immediately forget the data!" "Great!" "Hooray!" "Yay!"

  11. 3 #JustinBieber fans attend #Haifa-privacy-workshop

  12. A few minutes later, I come in … "What are you doing?" "A survey! How many JB fans attend the wkshop?" Trusted Party: "I will do the survey … publish the result … and forget the data!" "Me too!"

  13. 3 #JustinBieber fans attend #Haifa-privacy-workshop. Kobbi: "The tweet hides my info! Each attendee is a JB fan with only a 4/100 chance." Aha! (after @Kobbi joins): 4 #JustinBieber fans attend #Haifa-privacy-workshop

  14. Composition • Differencing attack: How is my privacy affected when an attacker sees an analysis before and after I join/leave? (See the sketch below.) • More generally, composition: How is my privacy affected when an attacker combines results from two or more privacy-preserving analyses? • Fundamental law of information: the more information we extract from our data, the more is learned about individuals! • So, privacy will deteriorate as we use our data more and more • Best desiderata: deterioration is quantifiable and controllable, and not abrupt
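The differencing attack is easy to make concrete. Below is a minimal sketch (an assumed example with hypothetical names, not from the deck) showing that exact "aggregate-only" releases before and after one person joins reveal that person's private bit exactly:

```python
# Differencing attack on exact counts: a minimal sketch (hypothetical data).
# Publishing the exact tally of JB fans before and after Kobbi joins reveals
# Kobbi's private bit, even though each release looks like an aggregate.

attendees = {"Alice": 1, "Bob": 0, "Carol": 1, "Dave": 1}  # 1 = JB fan

def exact_count(db):
    """The 'trusted party' publishes the exact tally and forgets the data."""
    return sum(db.values())

before = exact_count(attendees)   # tweet: "3 fans attend"

attendees["Kobbi"] = 1            # Kobbi walks in and is surveyed
after = exact_count(attendees)    # tweet: "4 fans attend"

# The attacker needs no access to the raw data:
kobbis_bit = after - before       # 1  => Kobbi is a JB fan. Aha!
print(f"Inferred Kobbi's private bit: {kobbis_bit}")
```

With calibrated noise in each release (e.g., the Laplace mechanism sketched after slide 27), the difference of the two tallies is itself noisy, so the attacker's advantage is bounded and degrades gracefully, not abruptly, under composition.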

  15. The Protagonists: Differential Privacy

  16. My Privacy Desiderata • Real world: Data (including Kobbi's data) → Analysis (Computation) → Outcome • My ideal world: Data w/my info removed → Analysis (Computation) → Outcome • Desideratum: same outcome

  17. Things to Note • In this talk, we only consider the outcome of analyses • Security flaws, hacking, implementation errors, … are very important but very different questions • My privacy desiderata would hide whether I'm a JB fan! Resilient to differencing attacks • Does not mean I'm fully protected: I'm only protected to the extent I'm protected in my ideal world; some harm could happen to me even in my ideal world • Example: Bob smokes in public; a study teaches that smoking causes cancer; Bob's health insurer raises his premium. Bob is harmed even if he does not participate in the study!

  18. Our Privacy Desiderata: should ignore Kobbi's info • Real world: Data → Analysis (Computation) → Outcome • My ideal world: Data w/my info removed → Analysis (Computation) → Outcome • Desideratum: same outcome

  19. Our Privacy Desiderata: should ignore Kobbi's info … and Gertrude's! • Real world: Data → Analysis (Computation) → Outcome • Gert's ideal world: Data w/Gert's info removed → Analysis (Computation) → Outcome • Desideratum: same outcome

  20. Our Privacy Desiderata: should ignore Kobbi's info, and Gertrude's, and Mark's! • Real world: Data → Analysis (Computation) → Outcome • Mark's ideal world: Data w/Mark's info removed → Analysis (Computation) → Outcome • Desideratum: same outcome

  21. Our Privacy Desiderata: should ignore Kobbi's info, and Gertrude's, and Mark's … and everybody's! • Real world: Data → Analysis (Computation) → Outcome • Each person's ideal world: Data w/that person's info removed → Analysis (Computation) → Outcome • Desideratum: same outcome

  22. A Realistic Privacy Desideratum • Real world: Data → Analysis (Computation) → Outcome • Each person's ideal world: Data w/that person's info removed → Analysis (Computation) → Outcome • Desideratum: the two outcomes are ε-"similar"

  23. Differential Privacy [Dwork McSherry N Smith 06] • Real world: Data → Analysis (Computation) → Outcome • Each person's ideal world: Data w/that person's info removed → Analysis (Computation) → Outcome • The two outcomes are ε-"similar" *See also: Differential Privacy: An Introduction for Social Scientists.

  24. Why Differential Privacy? • DP: a strong, quantifiable, composable mathematical privacy guarantee • Provably resilient to known and unknown attack modes! • Natural interpretation: I am protected (almost) to the extent I'm protected in my privacy-ideal scenario • Theoretically, DP enables many computations with personal data while preserving personal privacy • Practicality is in its first stages of validation

  25. Differential Privacy [Dwork McSherry N Smith 06] $M : X^n \to T$ satisfies $\varepsilon$-differential privacy if $\forall x, x' \in X^n$ s.t. $d_H(x, x') = 1$ and $\forall S \subseteq T$: $\Pr[M(x) \in S] \le e^{\varepsilon} \Pr[M(x') \in S]$.
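To make the definition concrete, here is a hedged, self-contained example (mine, not the deck's): randomized response [W65] on a single private bit, where the ε-DP inequality can be checked exactly rather than just asserted.

```python
# A minimal, self-contained sketch (assumed example): randomized response
# [W65] on one private bit satisfies eps-differential privacy.
import math
import random

def randomized_response(bit: int, eps: float) -> int:
    """Report the true bit w.p. e^eps/(1+e^eps); otherwise report its flip."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return bit if random.random() < p_truth else 1 - bit

# Verify the definition directly for neighboring inputs x=0 and x'=1:
# for every output s, Pr[M(x)=s] <= e^eps * Pr[M(x')=s].
eps = math.log(3)                      # truth-telling probability 3/4
p = math.exp(eps) / (1.0 + math.exp(eps))
for s in (0, 1):
    pr_x = p if s == 0 else 1 - p      # Pr[M(0) = s]
    pr_xp = p if s == 1 else 1 - p     # Pr[M(1) = s]
    ratio = max(pr_x / pr_xp, pr_xp / pr_x)
    assert ratio <= math.exp(eps) + 1e-9   # tolerance for float rounding
print("worst-case likelihood ratio equals e^eps =", math.exp(eps))
```

The inequality holds for every output, so whatever an attacker sees, their odds about any one person's bit shift by a factor of at most e^ε; this is exactly the ε-"similar" real-vs-ideal-world guarantee in the preceding slides.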

  26. It's Real!

  27. How is Differential Privacy Achieved? • Careful addition of random noise into the computation: Randomized Response [W65], framework of global sensitivity [DMNS06], framework of smooth sensitivity [NRS07], sample and aggregate [NRS07], exponential mechanism [MT07], propose-test-release [DL09], sparse vector technique [DNRRV09], private multiplicative weights [HR10], matrix mechanism [LHRMM10], choosing mechanism [BNS13], large margin mechanism [CHS14], dual query mechanism [GGHRW14], … (a minimal sketch of the global-sensitivity approach follows below) • Differentially private algorithms exist for many tasks: statistics, machine learning, private data release, …
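A hedged sketch of the global-sensitivity approach [DMNS06], using an assumed counting query: a count over a database has global sensitivity 1 (adding or removing one person changes it by at most 1), so adding Laplace noise with scale 1/ε yields an ε-differentially private release.

```python
# Minimal sketch (assumed example): the Laplace mechanism for a counting
# query. A count has global sensitivity 1, so Laplace(1/eps) noise gives
# eps-differential privacy [DMNS06].
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) by inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(db, predicate, eps: float) -> float:
    """Release a noisy count: true count + Laplace(sensitivity/eps) noise."""
    true_count = sum(1 for row in db if predicate(row))
    return true_count + laplace_noise(1.0 / eps)  # count sensitivity is 1

# Hypothetical use, echoing the workshop story: a DP tally of JB fans.
attendees = [{"name": "Alice", "jb_fan": True},
             {"name": "Bob", "jb_fan": False}]
print(dp_count(attendees, lambda r: r["jb_fan"], eps=0.5))
```

Basic composition then delivers the controllable deterioration slide 14 asks for: k such releases at ε each are together at most kε-differentially private.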

  28. Some Other Efforts to Bring DP to Practice • Microsoft Research "PINQ" • CMU-Cornell-PennState "Integrating Statistical and Computational Approaches to Privacy" (see http://onthemap.ces.census.gov/) • UCSD "Integrating Data for Analysis, Anonymization, and Sharing" (iDASH) • UT Austin "Airavat: Security & Privacy for MapReduce" • UPenn "Putting Differential Privacy to Work" • Stanford-Berkeley-Microsoft "Towards Practicing Privacy" • Duke-NISS "Triangle Census Research Network" • MIT/CSAIL/ALFA "MoocDB: Privacy Tools for Sharing MOOC Data" • …

  29. The Protagonists: FERPA
