LightDP: Towards Automating Differential Privacy Proofs



  1. LightDP: Towards Automating Differential Privacy Proofs. Danfeng Zhang and Daniel Kifer, Penn State University.

  2. Two databases: one without Alice's data, with output distribution $\mu_1$, and one with Alice's data, with output distribution $\mu_2$. Alice's data remain private if $\mu_1$ and $\mu_2$ are close.

  3. (Pure) Differential Privacy. If for any adjacent databases and any value $L$, $\mu_1(L)/\mu_2(L) \le e^{\epsilon}$ for some constant $\epsilon$, then the computation is $\epsilon$-private. The constant $\epsilon$ is the privacy cost.
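
The ratio bound above can be checked numerically on a small case. The sketch below is illustrative and not taken from the slides: it assumes a query whose true answers on two adjacent databases are 10 and 11, answered with Laplace noise of scale 1/ε, and the names `laplace_log_pdf` and `worst_log_ratio` are hypothetical helpers.

```python
import math

def laplace_log_pdf(x, mean, scale):
    """Log-density of a Laplace distribution with the given mean and scale."""
    return -abs(x - mean) / scale - math.log(2.0 * scale)

def worst_log_ratio(answer1, answer2, epsilon, outputs):
    """Largest |log(mu1(L) / mu2(L))| over the candidate outputs L."""
    # Assumption: the two true answers differ by at most 1, so scale 1/epsilon suffices.
    scale = 1.0 / epsilon
    return max(
        abs(laplace_log_pdf(L, answer1, scale) - laplace_log_pdf(L, answer2, scale))
        for L in outputs
    )

epsilon = 0.5
outputs = [k / 10.0 for k in range(0, 201)]  # candidate outputs 0.0 .. 20.0
worst = worst_log_ratio(10, 11, epsilon, outputs)
print(worst <= epsilon + 1e-9)
```

Working with log-densities avoids underflow in the tails and turns the ratio bound $e^{\epsilon}$ into an additive bound of ε.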

  4. Motivation. DP has seen explosive growth since 2006:
     • U.S. Census Bureau LEHD OnTheMap tool [Machanavajjhala et al. 2008]
     • Google Chrome browser [Erlingsson et al. 2014]
     • Apple's new data collection efforts [Greenberg 2016]
     This growth has been accompanied by flawed (paper-and-pencil) proofs, e.g., the ones categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]. Rigorous methods are needed for differential privacy proofs.

  5. Related Work.
     DP programming platforms (e.g., PINQ, Airavat):
     • Use (instead of verify) basic DP mechanisms
     • Cannot offer tight bounds for sophisticated algorithms
     Methods based on customized logics:
     • Steep learning curve
     • Heavy annotation burden
     LightDP offers a better balance between expressiveness and usability.

  6. LightDP: Overview. A relational, dependent type system transforms the source program into a target program with a distinguished variable $v_\epsilon$. Main theorem: if the source program type checks and $v_\epsilon$ is bounded by a constant $\epsilon$ in the target program, then the source program is $\epsilon$-private.

  7. Source Language: Syntax. [Grammar figure not preserved; its callouts mark the random expression (e.g., Laplace distribution) and the random variable.]

  8. Source Language: Semantics. Memory: a mapping from variables to values. [Diagram: an initial memory and an adjacent memory each lead to a final memory distribution; the two executions are related by relational reasoning via the type system.]

  9. Relational Types. A type pairs a base type (e.g., int, real) with a distance. Example: $\Gamma(x): \mathrm{num}_0$ relates the memories $x: u$ and $x: u$; $\Gamma(y): \mathrm{num}_1$ relates the memories $y: v$ and $y: v+1$.

  10. Dependent Types. The distance can be a program variable. Example: $\Gamma(x): \mathrm{num}_0$ relates the memories $x: u$ and $x: u$; $\Gamma(y): \mathrm{num}_x$ relates the memories $y: v$ and $y: v+u$.

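  11. Dependent Types. The distance can be a non-probabilistic expression. Example: $\Gamma(x): \mathrm{num}_0$ relates the memories $x: u$ and $x: u$; $\Gamma(y): \mathrm{num}_{x \ge 1\,?\,2\,:\,0}$ relates $y: v$ to $y: v+2$ when $u \ge 1$ and to $y: v$ when $u < 1$. Notation: $m_1\,\Gamma\,m_2$ if $m_1$ and $m_2$ are related by $\Gamma$.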
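
The relation between memories can be made concrete with a small sketch. This is only an illustration under assumptions: memories are modeled as Python dicts, each distance is a function of the first memory, and `related` and `gamma` are hypothetical names rather than anything defined in the slides.

```python
def related(m1, m2, gamma):
    """Return True if m1 and m2 are related by gamma.

    gamma maps each variable to a distance function; m2 must hold the value
    from m1 shifted by that distance, evaluated in m1 (an assumption of this sketch).
    """
    return all(m2[var] == m1[var] + dist(m1) for var, dist in gamma.items())

# Example from the slide: x has distance 0, y has distance (x >= 1 ? 2 : 0).
gamma = {
    "x": lambda m: 0,
    "y": lambda m: 2 if m["x"] >= 1 else 0,
}

print(related({"x": 3, "y": 5}, {"x": 3, "y": 7}, gamma))  # u >= 1, so y shifts by 2
print(related({"x": 0, "y": 5}, {"x": 0, "y": 5}, gamma))  # u < 1, so y is unchanged
```

Because the distance of `y` mentions `x`, the same type describes different shifts in different executions, which is what makes the type dependent.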

  12. (For the non-probabilistic subset.) Types form an invariant on two related program executions: if the initial memories satisfy $m_1\,\Gamma\,m_2$, then after executing a well-typed program, the final memories satisfy $m_1'\,\Gamma\,m_2'$. The invariant is enforced by a type system.
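
A minimal sketch of this invariant, under assumptions of my own: the program is the single deterministic assignment `z := x + y`, the typing environment gives `x` distance 0 and `y` and `z` distance 1, and memories are Python dicts. The helper names are hypothetical.

```python
def related(m1, m2, gamma):
    """True if every variable in m2 equals its value in m1 plus its distance."""
    return all(m2[var] == m1[var] + dist(m1) for var, dist in gamma.items())

def program(memory):
    """Run the deterministic command z := x + y on a copy of the memory."""
    result = dict(memory)
    result["z"] = result["x"] + result["y"]
    return result

# Assumed typing: distance(x) = 0, distance(y) = 1, hence distance(x + y) = 1 = distance(z).
gamma = {"x": lambda m: 0, "y": lambda m: 1, "z": lambda m: 1}

m1 = {"x": 4, "y": 7, "z": 0}
m2 = {"x": 4, "y": 8, "z": 1}   # related to m1 by gamma

print(related(m1, m2, gamma))                    # initial memories are related
print(related(program(m1), program(m2), gamma))  # so are the final memories
```

The check only exercises one pair of memories; the type system's role is to guarantee the same conclusion for every related pair without running the program.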

  13. Type System. Expression rules, e.g., for the operators + | − and the comparisons < | > | = | ≤ | ≥. [Typing rules not preserved.]

  14. Type System. Command rules, with two callouts: the distance must be identical, and related executions take the same branch. [Typing rules not preserved.]

  15. Relating Two Distributions. $\mu_1\,\Gamma\,\mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$. Program: $\eta := \mathrm{Lap}\ r$, a Laplace distribution with mean 0 and scale factor $r$. With $\Gamma(\eta) = \mathrm{num}_0$, there is no cost.

  16. Relating Two Distributions. $\mu_1\,\Gamma\,\mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$. Program: $\eta := \mathrm{Lap}\ r$, a Laplace distribution with mean 0 and scale factor $r$. Observation: $\eta$ may have an arbitrary distance, which affects the added cost. With $\Gamma(\eta) = \mathrm{num}_1$, the cost is $1/r$, due to a property of the Laplace distribution.
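
The cost of $1/r$ for distance 1 follows from the Laplace density: shifting a sample by $d$ changes the log-density by at most $|d|/r$. The sketch below checks that bound on a grid of samples; the function names and the chosen grid are assumptions for illustration.

```python
import math

def laplace_log_pdf(x, scale):
    """Log-density of a Laplace distribution with mean 0 and the given scale."""
    return -abs(x) / scale - math.log(2.0 * scale)

def alignment_cost(distance, scale):
    """Privacy cost charged when the sample is given this distance."""
    return abs(distance) / scale

def worst_log_ratio(distance, scale, samples):
    """Largest log(p(s) / p(s + distance)) over the given samples s."""
    return max(
        laplace_log_pdf(s, scale) - laplace_log_pdf(s + distance, scale)
        for s in samples
    )

r = 2.0
samples = [k / 4.0 for k in range(-40, 41)]  # samples from -10.0 to 10.0
for d in (0, 1, -3):
    assert worst_log_ratio(d, r, samples) <= alignment_cost(d, r) + 1e-9
```

With distance 0 the two densities coincide and nothing is charged, which matches the no-cost case on the previous slide.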

  17. Observation: $\eta$ may have an arbitrary distance, which affects the added cost, so $\eta$ has a polymorphic type. The sampling command in the source program becomes a non-deterministic operation in the target program, which explicitly tracks the added privacy cost. Intuitively, the target program computes the added cost for one sample from the distribution $\mu$.

  18. In General. The type system transforms the source program into a target program with a distinguished variable.

  19. Target Language. It includes a command that sets x to an arbitrary value. Verification task in the target language: proving that $v_\epsilon$ is bounded by some constant $\epsilon$ in any execution (of a non-probabilistic program). This is a safety property, and it can be verified using off-the-shelf tools (e.g., Hoare logic, model checking).
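
To illustrate the shape of such a target program, here is a rough sketch for a one-line source program `eta := Lap(1/epsilon); out := q + eta`. Everything specific is an assumption of the sketch: the query `q` is taken to have distance 1, the sample is given distance −1 so that `out` has distance 0, and the arbitrary-value command is modeled by passing `eta` in as a parameter.

```python
def target_noisy_answer(q, eta, epsilon):
    """Non-probabilistic target program for: eta := Lap(1/epsilon); out := q + eta."""
    v_eps = 0.0                      # distinguished variable tracking privacy cost
    # "Set eta to an arbitrary value": eta arrives as an unconstrained argument.
    scale = 1.0 / epsilon
    v_eps += abs(-1) / scale         # assumed distance -1 for eta, cost |d| / scale
    out = q + eta
    return out, v_eps

epsilon = 0.5
# Safety property: v_eps <= epsilon in every execution, whatever value eta takes.
for eta in (-1000.0, -1.5, 0.0, 2.25, 1e6):
    _, cost = target_noisy_answer(42, eta, epsilon)
    assert cost <= epsilon + 1e-12
```

Testing a handful of values is not a proof; a verifier would establish the bound for all values of `eta`, which is feasible here because the program no longer contains any sampling.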

  20. Putting Together: the Sparse Vector Method [Dwork and Roth '14] as the source program.
     • Correctness proof is subtle: incorrect variants are categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]
     • Formally verified very recently [Barthe et al. 2016], with heavy annotation burden
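
The slide's listing of the source program is not preserved, so the following is only a sketch of one common formulation of the Sparse Vector Method. The noise scales `2/epsilon` and `4 * max_above / epsilon`, the parameter names, and the Laplace sampler built from two exponentials are assumptions, not the slide's code.

```python
import random

def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sparse_vector(queries, threshold, max_above, epsilon, rng):
    """Report, per query answer, whether its noisy value reaches a noisy threshold.

    Stops once max_above answers have been reported as above the threshold.
    """
    noisy_threshold = threshold + laplace(2.0 / epsilon, rng)
    out = []
    above = 0
    for q in queries:
        if above >= max_above:
            break
        if q + laplace(4.0 * max_above / epsilon, rng) >= noisy_threshold:
            out.append(True)
            above += 1
        else:
            out.append(False)
    return out

rng = random.Random(0)
result = sparse_vector([1.0, 50.0, 2.0, 60.0, 70.0], 10.0, 2, 1.0, rng)
print(result)
```

The subtlety mentioned on the slide comes from small changes to this structure, such as where noise is added or what is released, which can silently break the privacy guarantee.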

  21. Required Types. The distance depends on the value of the $i$th query answer ($q[i]$). Type inference: the types can be inferred by the inference algorithm of LightDP.

  22. Target Program. [Listing not preserved.]

  23. Completing the Proof. Loop invariant and postcondition: [formulas not preserved]. Main theorem: source program type checks + $v_\epsilon$ bounded by constant $\epsilon$ = source program is $\epsilon$-private.
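
Since the target program and loop invariant on the slides are not preserved, the sketch below shows one plausible instantiation for the Sparse Vector Method and should be read as an assumption throughout. It assumes noise scales `2/epsilon` and `4 * max_above / epsilon`, distance 1 for the threshold noise, and distance 2 or 0 for the per-query noise depending on whether the noisy answer reaches the noisy threshold. The arbitrary values are passed in as `eta1` and `eta2s`.

```python
def sparse_vector_target(queries, threshold, max_above, epsilon, eta1, eta2s):
    """Non-probabilistic target program: returns the tracked privacy cost v_eps."""
    v_eps = 0.0
    v_eps += 1.0 / (2.0 / epsilon)            # eta1: assumed distance 1, scale 2/epsilon
    noisy_threshold = threshold + eta1
    above = 0
    for q, eta2 in zip(queries, eta2s):
        if above >= max_above:
            break
        # Candidate loop invariant: cost so far is eps/2 plus eps/(2N) per "above" answer.
        assert v_eps <= epsilon / 2 + above * epsilon / (2 * max_above) + 1e-9
        is_above = q + eta2 >= noisy_threshold
        distance = 2.0 if is_above else 0.0   # assumed dependent distance for eta2
        v_eps += distance / (4.0 * max_above / epsilon)
        if is_above:
            above += 1
    return v_eps

# Worst case: every answer is above the threshold until the budget of 3 is used up.
cost = sparse_vector_target([100.0] * 5, 0.0, 3, 1.0, 0.0, [0.0] * 5)
assert abs(cost - 1.0) < 1e-9
```

On loop exit the invariant and `above <= max_above` together give `v_eps <= epsilon`, which is the kind of postcondition the main theorem needs.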

  24. More in the Paper.
     • Type inference algorithm
     • Searching for the proof with minimum cost using MaxSMT
     • Formal proof of the main theorem
     • More verified examples (with little manual effort)

  25. Summary. A relational, dependent type system, automated by the inference engine, transforms the source program into a target program with a distinguished variable. Bounding that variable is a safety property, verified by existing tools. Decomposing differential privacy into subtasks substantially simplifies language-based proofs.
