
DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization (presentation slides)



  1. DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization. Benjamin Bichsel, Timon Gehr, Dana Drachsler-Cohen, Petar Tsankov, Martin Vechev

  2. Differential Privacy – Basic Setting. Query: # disease. Answer: 7.

  3. Differential Privacy – Basic Setting. Query: # disease. Answer: 7.3 (+ noise). What about my privacy?

  4. Differential Privacy - Intuition. Query: # disease. Answer: 7.3 (+ noise). Change my data: the answer becomes 7.6 (+ noise). Original data or changed data?

  5. Differential Privacy – More Abstractly. Input $x$: the attacker observes $F(x)$ and checks $F(x) \in \Phi$? Neighboring input $x'$: the attacker observes $F(x')$ and checks $F(x') \in \Phi$?

  6. Differential Privacy - Definition
     $\varepsilon$-DP: for all neighbouring inputs $x, x'$ and all attacker checks $\Phi$,
     $\frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]} \le \exp(\varepsilon) \approx 1 + \varepsilon$
     Challenges induced by DP:
     • Proving/checking $\varepsilon$-DP is hard (buggy algorithms)
     • Proof strategies not complete
     • Proofs only provide upper bounds

  7. $\varepsilon$-DP Counterexamples: triples $(x, x', \Phi)$ that violate $\varepsilon$-DP:
     $\frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]} > \exp(\varepsilon) \iff \log \frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]} > \varepsilon$

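  8. $\varepsilon$-DP Counterexamples: triples $(x, x', \Phi)$ that violate $\varepsilon$-DP:
     $\frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]} > \exp(\varepsilon) \iff \log \frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]} > \varepsilon$
     Maximize the left-hand side of the second inequality, $\varepsilon(x, x', \Phi) := \log \frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]}$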
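
As a concrete instance of the violation condition, the sketch below computes ε(x, x′, Φ) in closed form for a Laplace mechanism and compares it with a claimed bound. Everything specific here is an assumption made for illustration and is not taken from the slides: the mechanism F(x) = x + Lap(b), the threshold set Φ = [t, ∞), the inputs 8 and 7, the "buggy" variant with half the noise scale, and the helper names `laplace_sf` and `exact_eps`.

```python
import math

def laplace_sf(t, mu, b):
    """Pr[mu + Lap(b) >= t] for a Laplace distribution with location mu, scale b."""
    if t >= mu:
        return 0.5 * math.exp(-(t - mu) / b)
    return 1.0 - 0.5 * math.exp((t - mu) / b)

def exact_eps(x, x_prime, t, b):
    """eps(x, x', Phi) = log(Pr[F(x) in Phi] / Pr[F(x') in Phi]), Phi = [t, inf)."""
    return math.log(laplace_sf(t, x, b) / laplace_sf(t, x_prime, b))

claimed_eps = 0.1                # bound the algorithm is supposed to satisfy
x, x_prime, t = 8.0, 7.0, 10.0   # hypothetical neighbouring counts and threshold

# Correct noise scale for a count query: b = 1 / eps.
correct = exact_eps(x, x_prime, t, b=1.0 / claimed_eps)
# Hypothetical bug: only half the required noise is added.
buggy = exact_eps(x, x_prime, t, b=0.5 / claimed_eps)

tol = 1e-12  # guards the comparison against floating-point rounding
print(f"correct scale: eps = {correct:.3f}, violation: {correct > claimed_eps + tol}")
print(f"buggy scale:   eps = {buggy:.3f}, violation: {buggy > claimed_eps + tol}")
```

In this toy setting the triple (x, x′, Φ) is a counterexample only for the buggy variant, which is the kind of witness the maximization is meant to find.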

  9. Bounds on the "true" $\varepsilon$
     Proven: 10%-DP ($\varepsilon = 10\% = 0.1$)
     Counterexample: 5%-DP
     Counterexample: 9.9%-DP
     Counterexample: 15%-DP
     Evaluation: We get precise and large $\varepsilon$, close to known upper bounds

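  10. $\varepsilon$-DP Counterexamples
      Goal: Maximize $\varepsilon(x, x', \Phi)$
      Challenge 1: Expensive to compute $\varepsilon$ precisely. Remedy: estimate $\varepsilon$ by sampling ($\varepsilon \to \hat{\varepsilon}$)
      Challenge 2: Search space is sparse: few $(x, x', \Phi)$ lead to large $\varepsilon(x, x', \Phi)$. Remedy: make $\hat{\varepsilon}$ differentiable ($\hat{\varepsilon} \to \hat{\varepsilon}^{d}$)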

  11. Step 1: Estimate $\varepsilon$ ($\varepsilon \to \hat{\varepsilon}$): estimate $\varepsilon$ by sampling

  12. Estimating $\varepsilon$: $\varepsilon(x, x', \Phi) := \log \frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]}$

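  13. Estimating $\varepsilon$: $\varepsilon(x, x', \Phi) := \log \frac{\Pr[F(x) \in \Phi]}{\Pr[F(x') \in \Phi]}$
      $\widehat{\Pr}[F(x) \in \Phi] = \frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x)$
      Example with three runs of $F(x)$: 7.3 (check: yes), 7.6 (check: no), 6.8 (check: yes), giving yes 67%, no 33%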
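
The sampling estimate can be sketched in a few lines. This is a minimal illustration rather than DP-Finder's implementation: the Laplace mechanism standing in for F, the threshold set Φ = [t, ∞), the concrete inputs, and all function names are assumptions introduced here.

```python
import math
import random

def laplace_mechanism(x, b, rng):
    """One run of the assumed algorithm F(x) = x + Lap(b)."""
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    return x + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)

def check(output, t):
    """check_{F,Phi}: 1 if the output lies in Phi = [t, inf), else 0."""
    return 1 if output >= t else 0

def estimate_prob(x, t, b, n, rng):
    """(1/n) * sum_i check^i_{F,Phi}(x) over n independent runs of F."""
    return sum(check(laplace_mechanism(x, b, rng), t) for _ in range(n)) / n

def estimate_eps(x, x_prime, t, b, n, seed=0):
    """Sampling estimate of eps(x, x', Phi) = log(Pr[F(x) in Phi] / Pr[F(x') in Phi])."""
    rng = random.Random(seed)
    p = estimate_prob(x, t, b, n, rng)
    p_prime = estimate_prob(x_prime, t, b, n, rng)
    if p == 0.0 or p_prime == 0.0:
        # The log-ratio is undefined when no sample landed in Phi.
        raise ValueError("too few samples landed in Phi to form the log-ratio")
    return math.log(p / p_prime)

eps_hat = estimate_eps(x=8.0, x_prime=7.0, t=10.0, b=10.0, n=100_000)
print(f"estimated eps = {eps_hat:.3f} (exact value for this setup is 0.1)")
```

For this particular setup the exact value is log(e^{-0.2} / e^{-0.3}) = 0.1, so the printed estimate shows how far a given sampling effort n is from the truth.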

  14. How precise is our estimate? Counterexample: 9.9% ± 10%-DP vs. Counterexample: 9.9% ± $2 \cdot 10^{-3}$-DP
      Sampling effort $n$ → precision of $\Pr[F(x) \in \Phi]$ and $\Pr[F(x') \in \Phi]$ → precision of $\hat{\varepsilon}$
      Exponential search

  15. Estimating precisely is expensive
      Task: estimating $\varepsilon$ up to an error of $2 \cdot 10^{-3}$ with confidence of 90%
      [Figure: labels "Probabilistic guarantees", "Heuristic", "Efficient"; value $10^{4}$]

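  16. Applying the M-CLT (Correlation)
      Runs on $x$: 7.3 (yes), 7.6 (no), 6.8 (yes), averaged as $\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x)$
      Runs on $x'$: 7.3 (yes), 7.6 (no), 8.2 (no), averaged as $\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x')$
      The pair of averages follows a 2D Gaussian distribution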

  17. Obtaining a Confidence Interval for $\varepsilon$
      Joint likelihood of $\Pr[F(x) \in \Phi]$ and $\Pr[F(x') \in \Phi]$ → likelihood of $\varepsilon(x, x', \Phi)$ → confidence interval for $\varepsilon(x, x', \Phi)$
      Distribution of Gauss / Gauss (correlated): D. V. Hinkley. 1969. On the Ratio of Two Correlated Normal Random Variables. Biometrika 56, 3 (1969), 635–639. http://www.jstor.org/stable/2334671
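
A rough sketch of how correlated checks can tighten the interval. Several parts are assumptions made here and not taken from the slides: the Laplace mechanism standing in for F, the set Φ = [t, ∞), the function names, and the coupling of the two runs through shared noise. The sketch also swaps in the first-order delta method for the exact ratio distribution of Hinkley cited above, so it is an approximation and does not reproduce the deck's procedure.

```python
import math
import random

def paired_checks(x, x_prime, t, b, n, seed=0):
    """Run F on x and x' with the same noise in run i, so the two checks correlate."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)  # Laplace(0, b)
        pairs.append((1 if x + noise >= t else 0,
                      1 if x_prime + noise >= t else 0))
    return pairs

def eps_confidence_interval(pairs, z=1.645):
    """Delta-method interval for log(p / q); z = 1.645 gives roughly 90% confidence."""
    n = len(pairs)
    p = sum(a for a, _ in pairs) / n   # estimate of Pr[F(x) in Phi]
    q = sum(c for _, c in pairs) / n   # estimate of Pr[F(x') in Phi]
    if p == 0.0 or q == 0.0:
        raise ValueError("too few samples landed in Phi")
    var_p = p * (1.0 - p)
    var_q = q * (1.0 - q)
    cov = sum(a * c for a, c in pairs) / n - p * q
    # First-order variance of log(p_hat) - log(q_hat) for a jointly Gaussian pair.
    var_eps = (var_p / p**2 - 2.0 * cov / (p * q) + var_q / q**2) / n
    half_width = z * math.sqrt(max(var_eps, 0.0))
    return math.log(p / q), half_width, cov

pairs = paired_checks(x=8.0, x_prime=7.0, t=10.0, b=10.0, n=100_000)
eps_hat, half_width, cov = eps_confidence_interval(pairs)
print(f"eps = {eps_hat:.4f} +/- {half_width:.4f} (sample covariance {cov:.4f})")
```

A positive covariance enters the variance with a minus sign, which is why sharing the noise between the two runs narrows the interval compared with independent sampling at the same n.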

  18. How precise is our estimate? Counterexample: 9.9% ± 10%-DP vs. Counterexample: 9.9% ± $2 \cdot 10^{-3}$-DP

  19. Step 2: Finding Counterexamples ($\hat{\varepsilon} \to \hat{\varepsilon}^{d}$): make $\hat{\varepsilon}$ differentiable

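  20. How can we optimize our estimate?
      maximize $\hat{\varepsilon}(x, x', \Phi) = \log \frac{\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x)}{\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x')}$ (not differentiable)
      Goals:
      • Make differentiable
      • Preserve semantics
      Rewrite rules:
      • $\neg B \rightsquigarrow 1 - B$
      • $B_1 \wedge B_2 \rightsquigarrow B_1 \cdot B_2$
      • if $B$: $x = E_1$ else: $x = E_2$ $\rightsquigarrow$ $x = B \cdot E_1 + (1 - B) \cdot E_2$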
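
The three rewrite rules are plain arithmetic, so they can be sketched directly. The function names are hypothetical, and how comparisons inside the algorithm are turned into values between 0 and 1 is not covered by this slide and is left out here.

```python
def soft_not(b):
    """not B  ~>  1 - B"""
    return 1.0 - b

def soft_and(b1, b2):
    """B1 and B2  ~>  B1 * B2"""
    return b1 * b2

def soft_if(b, e1, e2):
    """if B: x = E1 else: x = E2  ~>  x = B * E1 + (1 - B) * E2"""
    return b * e1 + (1.0 - b) * e2

# On crisp truth values (0.0 or 1.0) the rewritten forms agree with the originals.
for b1 in (0.0, 1.0):
    assert soft_not(b1) == float(not b1)
    assert soft_if(b1, 4.0, 8.0) == (4.0 if b1 else 8.0)
    for b2 in (0.0, 1.0):
        assert soft_and(b1, b2) == float(bool(b1) and bool(b2))

# In between, the result varies smoothly with B, which is what gradients need.
print(soft_if(0.25, 4.0, 8.0))
```

Preserving semantics here means exactly the first loop: whenever B is a definite 0 or 1, the rewritten program computes the same value as the original branch.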

  21. How can we optimize our estimate?
      maximize $\hat{\varepsilon}(x, x', \Phi) = \log \frac{\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x)}{\frac{1}{n} \sum_{i=1}^{n} \mathrm{check}^{i}_{F,\Phi}(x')}$ (not differentiable)
      • Maximize using SLSQP (supports hard constraints for neighborhood)
      • Random starting point (+ restart)
      • What about division by zero?
      • What about very small denominators?

  22. Main differences to Ding et al.
      Dimension | Ding et al. | This work
      Problem statement | $\varepsilon(x, x', \Phi) > \varepsilon_0$ ? | Maximize $\varepsilon(x, x', \Phi)$
      Approach | Statistical tests | Estimate + confidence interval
      Search | By patterns | Gradient descent (incremental)

  23. Evaluation
      • How precise is the differentiable estimate? (Exact solver (PSI) for ground truth)
      • How efficient is DP-Finder in finding violations compared to random search?

  24. Precision of Differentiable Estimate
      [Figure: $\hat{\varepsilon}$, $\hat{\varepsilon}^{d}$ and $\varepsilon$ per algorithm; axis label "Algorithms"]

  25. Random vs Optimized
      [Figure: legend "Optimized", "Random start"]

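  26. Conclusion
      • Differential Privacy
      • $\varepsilon$-DP Counterexamples $(x, x', \Phi)$
      • Estimate $\varepsilon$ ($\varepsilon \to \hat{\varepsilon}$)
      • Finding Counterexamples ($\hat{\varepsilon} \to \hat{\varepsilon}^{d}$)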
