LightDP: Towards Automating Differential Privacy Proofs
Danfeng Zhang, Daniel Kifer
Penn State University
[Figure: a database without Alice's data yields output distribution $P_1$; a database with Alice's data yields output distribution $P_2$]
Alice's data remain private if $P_1$ and $P_2$ are close.
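(Pure) Differential Privacy
[Figure: the output distributions $P_1(w)$ and $P_2(w)$ of two adjacent databases]
If, for any adjacent databases and any value $w$, $P_1(w)/P_2(w) \le e^{\epsilon}$ for some constant $\epsilon$, then the computation is $\epsilon$-private. The constant $\epsilon$ is the privacy cost.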
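The ratio bound above can be checked numerically for one concrete mechanism. The following is a minimal sketch, assuming a counting query of sensitivity 1 answered with Laplace noise of scale 1/epsilon; the function names and the example answers 10 and 11 are illustrative and do not come from the slides.

```python
import math

def laplace_pdf(x, mean, scale):
    # Density of the Laplace distribution with the given mean and scale.
    return math.exp(-abs(x - mean) / scale) / (2.0 * scale)

def max_privacy_ratio(answer1, answer2, scale, outputs):
    # Largest ratio P1(w)/P2(w) or P2(w)/P1(w) over the sampled outputs w.
    worst = 0.0
    for w in outputs:
        p1 = laplace_pdf(w, answer1, scale)
        p2 = laplace_pdf(w, answer2, scale)
        worst = max(worst, p1 / p2, p2 / p1)
    return worst

epsilon = 0.5
scale = 1.0 / epsilon  # assumption: the query has sensitivity 1
outputs = [i / 10.0 for i in range(-200, 201)]

# True answers 10 and 11 stand in for two adjacent databases.
ratio = max_privacy_ratio(10.0, 11.0, scale, outputs)
print(ratio <= math.exp(epsilon) * (1 + 1e-9))
```

This only samples a grid of outputs, so it illustrates the definition rather than proving it.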
Motivation
DP has seen explosive growth since 2006:
• U.S. Census Bureau LEHD OnTheMap tool [Machanavajjhala et al. 2008]
• Google Chrome Browser [Erlingsson et al. 2014]
• Apple's new data collection efforts [Greenberg 2016]
But this growth has been accompanied by flawed (paper-and-pencil) proofs:
• e.g., the ones categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]
Rigorous methods are needed for differential privacy proofs.
Related Work
DP programming platforms (e.g., PINQ, Airavat):
• Use (instead of verify) basic DP mechanisms
• Cannot offer tight bounds for sophisticated algorithms
Methods based on customized logics:
• Steep learning curve
• Heavy annotation burden
LightDP offers a better balance between expressiveness and usability.
LightDP: Overview
[Figure: Source Program → Relational, Dependent Type System → Target Program with distinguished variable $v_\epsilon$]
Main Theorem: if the source program type checks and $v_\epsilon$ is bounded by a constant $\epsilon$ in the target program, then the source program is $\epsilon$-private.
Source Language: Syntax
[Figure: grammar of the source language; callouts mark the random expression (e.g., Laplace dist.) and the random variable]
Source Language: Semantics
Memory: a mapping from variables to values.
[Figure: an initial memory and an adjacent memory each execute to a final memory distribution; the two executions are related by relational reasoning via the type system]
Relational Types
A type consists of a base type (e.g., int, real) and a distance.
Example                          Related memories
$\Gamma(y) : \mathrm{num}_0$     y: u  |  y: u
$\Gamma(z) : \mathrm{num}_1$     z: v  |  z: v + 1
Dependent Types
The distance can be a program variable.
Example                          Related memories
$\Gamma(y) : \mathrm{num}_0$     y: u  |  y: u
$\Gamma(z) : \mathrm{num}_y$     z: v  |  z: v + u
Dependent Types
The distance can be a non-probabilistic expression.
Example                                          Related memories
$\Gamma(y) : \mathrm{num}_0$                     y: u  |  y: u
$\Gamma(z) : \mathrm{num}_{y \ge 1\,?\,2\,:\,0}$     z: v  |  z: v + 2 if u ≥ 1; z: v if u < 1
Notation: $m_1 \,\Gamma\, m_2$ if $m_1$ and $m_2$ are related by $\Gamma$.
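The relation between two memories under a dependent typing environment can be sketched as follows. This is an illustrative model only: the dictionary `gamma`, the helper `related_memory`, and the variable names are hypothetical, and each distance is represented as a function of the first memory.

```python
def related_memory(gamma, m1):
    # Build the memory related to m1: every variable is shifted by its
    # distance, and the distance may depend on values stored in m1.
    return {var: val + gamma[var](m1) for var, val in m1.items()}

# Typing environment mirroring the example: y has distance 0,
# z has distance (y >= 1 ? 2 : 0).
gamma = {
    "y": lambda m: 0,
    "z": lambda m: 2 if m["y"] >= 1 else 0,
}

m1 = {"y": 3, "z": 10}
m2 = related_memory(gamma, m1)
print(m2)  # {'y': 3, 'z': 12}
```

When `y` is below 1, the distance of `z` evaluates to 0 and the two memories coincide.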
(For the non-probabilistic subset)
Types form an invariant on two related program executions:
If the initial memories satisfy $m_1 \,\Gamma\, m_2$, then after executing a well-typed program, the final memories satisfy $m_1' \,\Gamma'\, m_2'$.
This invariant is enforced by a type system.
Type System
Expression: [typing rules not preserved], e.g., for the arithmetic operators + | − and the comparison operators < | > | = | ≤ | ≥
Type System
Command: [typing rules not preserved]
Callouts on the rules: the distance must be identical; related executions take the same branch.
Relating Two Distributions
$\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$
Program: $\eta := \mathrm{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$)
With $\Gamma(\eta) = \mathrm{num}_0$: no cost.
Relating Two Distributions
$\mu_1 \,\Gamma\, \mu_2$ w.r.t. privacy cost $\epsilon$ if $\forall m.\ \mu_1(m)/\mu_2(\Gamma(m)) \le e^{\epsilon}$
Program: $\eta := \mathrm{Lap}\ r$ (Laplace dist. w/ mean 0 and a scale factor $r$)
With $\Gamma(\eta) = \mathrm{num}_1$: cost $1/r$, due to a property of the Laplace distribution.
Observation: $\eta$ may have an arbitrary distance, which affects the added cost.
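The cost of aligning a Laplace sample with a shifted sample can be checked numerically. The sketch below assumes the standard Laplace density and compares a sample `s` in one execution with `s + distance` in the other; the helper names and the sample grid are illustrative.

```python
import math

def laplace_pdf(x, scale):
    # Density of the Laplace distribution with mean 0.
    return math.exp(-abs(x) / scale) / (2.0 * scale)

def worst_ratio(distance, scale, samples):
    # Largest density ratio between a sample s and its shifted
    # counterpart s + distance in the related execution.
    return max(
        laplace_pdf(s, scale) / laplace_pdf(s + distance, scale)
        for s in samples
    )

scale = 2.0
samples = [i / 4.0 for i in range(-80, 81)]

cost_d0 = math.log(worst_ratio(0.0, scale, samples))  # distance 0
cost_d1 = math.log(worst_ratio(1.0, scale, samples))  # distance 1
print(cost_d0, cost_d1, 1.0 / scale)
```

On this grid, distance 0 gives a log-ratio of 0 and distance 1 gives a log-ratio that agrees with 1/scale up to floating-point error.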
Observation: $\eta$ may have an arbitrary distance, which affects the added cost.
$\eta$ has a polymorphic type.
[Figure: a sampling command in the source program and its translation in the target program; callouts: the translation is a non-deterministic operation, and the target program explicitly tracks the added privacy cost]
Intuitively, the target program computes the added cost for one sample from the distribution.
In General
[Figure: the type system transforms the source program into the target program with distinguished variable $v_\epsilon$]
Target Language
[Figure: grammar of the target language; callout: set x to arbitrary value]
Verification task in the target language: proving that $v_\epsilon$ is bounded by some constant $\epsilon$ in any execution (of a non-probabilistic program).
This is a safety property. It can be verified using off-the-shelf tools (e.g., Hoare logic, model checking).
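The shape of such a target program can be sketched in Python. This is a hypothetical example rather than a program from the slides: it assumes a source program that draws `n` Laplace samples of scale `n / epsilon`, each typed with distance 1, so each sample adds `1 / scale` to the cost variable `v_eps`. The `havoc` callback stands in for setting a variable to an arbitrary value, and the random harness only tests the bound, whereas a verifier would prove it for every execution.

```python
import random

def target_program(n, epsilon, havoc):
    # Non-probabilistic target program: sampling is replaced by havoc,
    # and v_eps explicitly accumulates the privacy cost.
    v_eps = 0.0
    scale = n / epsilon
    released = []
    for _ in range(n):
        eta = havoc()            # replaces "eta := Lap(scale)"
        v_eps += 1.0 / scale     # assumed cost of a distance-1 sample
        released.append(eta)
    return v_eps, released

rng = random.Random(0)
epsilon = 1.0
worst_cost = max(
    target_program(5, epsilon, lambda: rng.uniform(-1e6, 1e6))[0]
    for _ in range(1000)
)
print(worst_cost <= epsilon + 1e-9)
```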
Putting Together
The Sparse Vector Method [Dwork and Roth '14]
[Figure: source program]
• The correctness proof is subtle: incorrect variants are categorized in [Chen & Machanavajjhala '15, Lyu et al. '16]
• Formally verified very recently [Barthe et al. 2016], with a heavy annotation burden
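For readers unfamiliar with the algorithm, here is a minimal sketch of one common presentation of the sparse vector method. The noise scales (`2 / epsilon` for the threshold and `4 * max_above / epsilon` for each query) assume queries of sensitivity 1 and are not taken from the slides; the function and parameter names are illustrative.

```python
import random

def laplace(scale, rng):
    # The difference of two exponential samples with mean `scale`
    # follows a Laplace distribution with mean 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sparse_vector(queries, threshold, max_above, epsilon, rng):
    # Report, for each query answer, whether its noisy value reaches the
    # noisy threshold; stop after `max_above` positive reports.
    noisy_threshold = threshold + laplace(2.0 / epsilon, rng)
    answers = []
    count = 0
    for q in queries:
        if count >= max_above:
            break
        noisy_q = q + laplace(4.0 * max_above / epsilon, rng)
        if noisy_q >= noisy_threshold:
            answers.append(True)
            count += 1
        else:
            answers.append(False)
    return answers

rng = random.Random(42)
result = sparse_vector([1.0, 9.0, 2.0, 8.0, 7.0], 5.0, 2, 1.0, rng)
print(result)
```

The output depends on the random draws, so no particular answer vector should be expected.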
Required Types
[Figure: source program annotated with types]
The distance depends on the value of the $i$th query answer ($q[i]$).
Type inference: the types can be inferred by the inference algorithm of LightDP.
Target Program
[Figure: target program listing not preserved]
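Completing the Proof
[Figure: target program annotated with a loop invariant and a postcondition; the formulas are not preserved]
Main Theorem: source program type checks + $v_\epsilon$ bounded by constant $\epsilon$ ⟹ source program is $\epsilon$-private.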
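A loop invariant of this kind can be illustrated on a target program loosely modeled on the sparse vector method. Everything specific below is an assumption for illustration: the cost expressions, the noise scales they imply, and the invariant `v_eps <= epsilon/2 + count * epsilon / (2 * max_above)` are not recovered from the slides. Random testing stands in for the proof that a verifier would produce.

```python
import random

def svt_target(queries, threshold, max_above, epsilon, havoc):
    # Hypothetical target program: samples are havoc'ed and v_eps
    # tracks the privacy cost explicitly.
    v_eps = 0.0
    eta1 = havoc()
    v_eps += epsilon / 2.0       # assumed: distance 1, scale 2/epsilon
    noisy_threshold = threshold + eta1
    count = 0
    for q in queries:
        if count >= max_above:
            break
        # Assumed loop invariant, checked at run time.
        bound = epsilon / 2.0 + count * epsilon / (2.0 * max_above)
        assert v_eps <= bound + 1e-12
        eta2 = havoc()
        above = q + eta2 >= noisy_threshold
        # Assumed: distance 2 when above the threshold, else 0,
        # with scale 4 * max_above / epsilon.
        v_eps += (2.0 if above else 0.0) * epsilon / (4.0 * max_above)
        if above:
            count += 1
    return v_eps

rng = random.Random(1)
epsilon = 1.0
worst = max(
    svt_target([rng.uniform(0, 10) for _ in range(20)], 5.0, 3, epsilon,
               lambda: rng.uniform(-10, 10))
    for _ in range(500)
)
print(worst <= epsilon + 1e-9)
```

Under these assumptions the invariant, combined with the exit condition `count >= max_above`, yields the postcondition `v_eps <= epsilon`.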
More in the Paper
• Type inference algorithm
• Searching for the proof with minimum cost using MaxSMT
• Formal proof of the main theorem
• More verified examples (with little manual effort)
Summary
[Figure: Source Program → Relational, Dependent Type System (automated by an inference engine) → Target Program with distinguished variable → a safety property (verified by existing tools)]
Decomposing differential privacy into subtasks substantially simplifies language-based proofs.