Adversarial Classification Under Differential Privacy



  1. Adversarial Classification Under Differential Privacy
     Jairo Giraldo (University of Utah), Alvaro A. Cardenas (UC Santa Cruz), Murat Kantarcioglu (UT Dallas), Jonathan Katz (GMU)

  2. • 20th Century: computers were brains without senses: they only knew what we told them.
     • There is more information in the world than people can type on a keyboard.
     • 21st century: computers sense things, e.g., the GPS we take for granted in our phones. Kevin Ashton (British entrepreneur) coined the term IoT in 1999.

  3. New Privacy Concerns

  4. In Addition to Privacy, There is Another Problem: Data Trustworthiness

  5. We Need to Provide 3 Properties
     1. Classical Utility
        • Usable statistics
        • The reason for data collection
     2. Privacy
        • Protect consumer data
     3. Security (this work)
        • Trustworthy data
        • Detect data poisoning
        • Different from classical utility because this is an adversarial setting

  6. New Adversary Model
     • Consumer data are protected by Differential Privacy (DP)
     • The classical adversary in DP is curious
     • Our adversary is different: it poisons the data, hiding its attacks in the DP noise
     • Global and local DP
     [Figure: a database of records d_1, ..., d_n answering queries through a DP mechanism, and sensors 1 through n each reporting through local DP.]
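     As an illustration (not from the slides), a minimal sketch of the kind of local-DP perturbation assumed above: each sensor adds Laplace noise scaled by an assumed sensitivity and privacy budget epsilon before reporting. The readings and parameter values are made up.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with epsilon-differential privacy by adding
    Laplace noise with scale b = sensitivity / epsilon."""
    rng = rng or np.random.default_rng()
    b = sensitivity / epsilon
    return value + rng.laplace(loc=0.0, scale=b)

# Each sensor perturbs its own reading before sending it (local DP);
# the aggregator then averages the noisy reports.
readings = np.array([2.3, 2.9, 2.7, 2.4])   # hypothetical per-user data
noisy = [laplace_mechanism(x, sensitivity=1.0, epsilon=0.5) for x in readings]
print(np.mean(noisy))
```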

  7. Adversary Goals
     • Intelligently poison the data in a way that is hard to detect (hide the attack in the DP noise)
     • Achieve maximum damage to the utility of the system (deviate the estimate as much as possible)
     Classical DP: the response is $\bar{Y} \leftarrow M(D)$ with $\bar{Y} \sim f_0$. The attack reports $Y^a \sim f_a$ instead of $\bar{Y}$.
     Attack goal (multi-criteria optimization):
     $$\max_{f_a \in \mathcal{F}} \; E[Y^a] \quad \text{s.t.} \quad D_{KL}(f_a \,\|\, f_0) \le \gamma$$
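     A small numeric illustration of the KL budget (not the paper's attack): assuming $f_0$ is Laplace$(\theta, b)$ and restricting the attacker to a plain mean shift $f_a(y) = f_0(y - \mu)$, the constraint $D_{KL}(f_a \| f_0) \le \gamma$ caps how far the estimate can be pushed. The optimal distribution derived on the following slides does strictly better than this baseline.

```python
import numpy as np
from scipy.optimize import brentq

# Assumption for the demo: f_0 = Laplace(theta, b), attacker only shifts the mean.
# For that family, D_KL(f_a || f_0) = mu/b + exp(-mu/b) - 1 for mu >= 0.
b, gamma = 2.0, 0.5

def kl_shift(mu):
    return mu / b + np.exp(-mu / b) - 1.0

# Largest admissible shift under the budget: solve kl_shift(mu) = gamma.
mu_max = brentq(lambda mu: kl_shift(mu) - gamma, 0.0, 100 * b)
print(f"max 'undetectable' mean shift for gamma={gamma}: {mu_max:.3f}")
```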

  8. Functional Optimization Problem
     • We have to find a probability distribution: a probability density function $f_a$
     • Among all possible continuous functions, as long as $\int_{r \in \Omega} f_a(r)\,dr = 1$
     • What is the shape of $f_a$?

  9. Solution: Variational Methods
     • Variational methods are a useful tool to find the shape of functions or the structure of matrices
     • They replace the function or matrix optimization problem with a parameterized perturbation of the function or matrix
     • We can then optimize with respect to the parameter to find the "shape" of the function/matrix
     • The Lagrange multipliers give us the final parameters of the function

  10. Solution
      Maximize $\int_{r \in \Omega} r\, f_a(r)\,dr$
      Subject to: $\int_{r \in \Omega} f_a(r) \ln\!\left(\frac{f_a(r)}{f_0(r)}\right) dr \le \gamma$ and $\int_{r \in \Omega} f_a(r)\,dr = 1$.
      Auxiliary function: $q(r, \alpha) = f_a^*(r) + \alpha\, p(r)$.
      Lagrangian:
      $$L(\alpha) = \int_{r \in \Omega} r\, q(r, \alpha)\,dr + \kappa_1 \left( \int_{r \in \Omega} q(r, \alpha) \ln\frac{q(r, \alpha)}{f_0(r)}\,dr - \gamma \right) + \kappa_2 \left( \int_{r \in \Omega} q(r, \alpha)\,dr - 1 \right)$$
      Solution: $f_a^*(y) = \dfrac{f_0(y)\, e^{y/\kappa_1}}{\int_\Omega f_0(r)\, e^{r/\kappa_1}\,dr}$, where $\kappa_1$ is the solution to $D_{KL}(f_a^* \,\|\, f_0) = \gamma$.
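      A minimal numerical sketch of this solution, assuming a Laplace base density $f_0$ for concreteness: build the exponentially tilted density $f_a^*(y) \propto f_0(y)\, e^{y/\kappa_1}$ on a grid and pick $\kappa_1$ by root-finding so that $D_{KL}(f_a^* \| f_0) = \gamma$. The grid range and parameters are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import brentq

b, gamma = 2.0, 0.5
y = np.linspace(-60, 60, 4001)
dy = y[1] - y[0]
f0 = np.exp(-np.abs(y) / b) / (2 * b)        # Laplace(0, b) base density

def tilt(kappa1):
    """Exponentially tilted density f_a*(y) ∝ f_0(y) exp(y / kappa1), on the grid."""
    w = f0 * np.exp(y / kappa1)
    return w / (w.sum() * dy)

def kl_gap(kappa1):
    fa = tilt(kappa1)
    return np.sum(fa * np.log(fa / f0)) * dy - gamma

# The KL divergence shrinks as kappa1 grows (weaker tilt), so bracket a root above b.
kappa1 = brentq(kl_gap, b * 1.01, 1e3)
fa = tilt(kappa1)
print("kappa1 =", kappa1, " attacker mean =", np.sum(y * fa) * dy)
```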

  11. Least-Favorable Laplace Attack
      [Figure: a database of per-user readings is aggregated under differential privacy; the densities of the possible private response and the possible compromised response overlap, so the attack hides in the DP noise.]
      $$f_0(y) = \frac{1}{2b}\, e^{-|y-\theta|/b}, \qquad f_a^*(y) = \frac{\kappa_1^2 - b^2}{\kappa_1^2} \cdot \frac{1}{2b}\, e^{-|y-\theta|/b + (y-\theta)/\kappa_1}$$
      where $\kappa_1$ is the solution to $\dfrac{2 b^2}{\kappa_1^2 - b^2} + \ln\!\left(1 - \dfrac{b^2}{\kappa_1^2}\right) = \gamma$.
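      A hedged sketch of the Laplace case above: solve the stated equation for $\kappa_1$ numerically, then draw from $f_a^*$, which is an asymmetric (tilted) Laplace and can be sampled as a mixture of two exponentials. The parameter values $b$, $\gamma$, $\theta$ are made up.

```python
import numpy as np
from scipy.optimize import brentq

def kappa1_for_gamma(b, gamma):
    """Solve 2 b^2 / (k^2 - b^2) + ln(1 - b^2 / k^2) = gamma for k > b."""
    g = lambda k: 2 * b**2 / (k**2 - b**2) + np.log(1 - b**2 / k**2) - gamma
    return brentq(g, b * (1 + 1e-9), 1e6)

def sample_attack(theta, b, kappa1, size, rng=None):
    """Draw from f_a*(y) ∝ exp(-|y - theta|/b + (y - theta)/kappa1):
    an asymmetric Laplace, sampled as a two-sided exponential mixture."""
    rng = rng or np.random.default_rng()
    rate_pos = 1 / b - 1 / kappa1            # decay rate for y > theta
    rate_neg = 1 / b + 1 / kappa1            # decay rate for y < theta
    p_pos = (1 / rate_pos) / (1 / rate_pos + 1 / rate_neg)
    go_right = rng.random(size) < p_pos
    u = np.where(go_right,
                 rng.exponential(1 / rate_pos, size),
                 -rng.exponential(1 / rate_neg, size))
    return theta + u

b, gamma, theta = 2.0, 0.5, 10.0
k1 = kappa1_for_gamma(b, gamma)
samples = sample_attack(theta, b, k1, size=100_000)
print("kappa1 =", k1,
      " empirical mean shift =", samples.mean() - theta,
      " analytic mean shift =", 2 * b**2 * k1 / (k1**2 - b**2))
```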

  12. Example: Traffic Flow Estimation
      We use loop detection data from California:
      • Vehicle count
      • Occupancy

  13. Classical Bad Data Detection in Traffic Flow Estimation
      Sensor readings pass through DP before the prediction and bad-data detection (BDD) at the traffic management center (TMC).
      Prediction for cell $i$:
      $$\hat{y}_i(k+1) = \hat{y}_i(k) + \frac{T}{l_i}\left(F_i^{in}(k) - F_i^{out}(k)\right) + Q_i\left(y_i(k) - \hat{y}_i(k)\right)$$
      [Figure: loop detectors in roadside cabinets report the flows $F_i^{in}(k)$ and $F_i^{out}(k)$ between cells $i-1$, $i$, and $i+1$.]
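      A minimal sketch of one prediction/correction step as reconstructed above, with made-up values for the step length T, cell length l_i, observer gain Q_i, and readings; the residual $y_i(k) - \hat{y}_i(k)$ is what the BDD thresholds.

```python
def observer_step(y_hat, y_meas, f_in, f_out, T, l, Q):
    """One prediction/correction step: predict the next state from the flow
    balance, then correct with the (DP-noisy) sensor reading. The residual
    feeds the bad-data detector."""
    y_pred = y_hat + (T / l) * (f_in - f_out)   # flow-balance prediction
    residual = y_meas - y_hat                   # innovation used by the BDD
    return y_pred + Q * residual, residual

# Hypothetical numbers: current estimate 20, noisy reading 22.5,
# inflow/outflow per step, cell length l, observer gain Q.
y_hat_next, r = observer_step(y_hat=20.0, y_meas=22.5,
                              f_in=10.0, f_out=9.0,
                              T=1.0, l=0.5, Q=0.3)
print(y_hat_next, r)
```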

  14. The Attack Can Hide in DP Noise and Cause a Larger Impact
      • Without DP, the attack is limited.
      • With DP, the attacker can lie more without detection.
      Can we do better?

  15. Defense Against Adversarial (Adaptive) Distributions
      • Player 1 designs a classifier $D \in S$ to minimize $\Phi(D, A)$ (e.g., Pr[missed detection] subject to a fixed false-alarm rate)
        - Player 1 makes the first move
      • Player 2 (the attacker) has multiple strategies $A \in F$
        - It moves after observing the move of the classifier
      • Player 1 wants provable performance guarantees:
        - Once it selects $D^o$ by minimizing $\Phi$, it wants a proof that no matter what the attacker does, $\Phi < m$

  16. Defense in Traffic Case
      We propose a new defense, formulated as a game between the attacker and the defender.
      [Figure: attack impact in the traffic example with the classical defense vs. with our defense.]

  17. Another Example: Sharing Electricity Consumption
      [Figure: attack impact S (MW) versus the level of privacy (epsilon), comparing the classical BDD and the DP-aware BDD for three detector settings (0.01, 0.02, 0.03).]

  18. Conclusions
      • Growing number of applications where we need to provide utility, privacy, and security
      • In particular, adversarial classification under differential privacy
      • Various possible extensions:
        - Different quantifications of privacy loss (e.g., Rényi DP)
        - Other adversary models (e.g., noiseless privacy)
      • Related work on DP and adversarial ML: certified robustness

  19. Strategic Adversary + Defender
      • Player 1 designs a classifier $D \in S$ minimizing $\Phi(D, A)$ (e.g., Pr[error])
        - The defender makes the first move
      • Player 2 (the attacker) has multiple strategies $A \in F$
        - The attacker moves after observing the move of the classifier
      • Player 1 wants provable performance guarantees:
        - Once it selects $D^o$ by minimizing $\Phi$, it wants a proof that no matter what the attacker does, $\Phi < m$

  20. Strategy: Solve the Maximin Problem and Show Its Solution Equals the Minimax
      - For any finite, zero-sum game:
      - Minimax = Maximin = Nash equilibrium (saddle point)
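      A small worked example of this strategy on a hypothetical 3x3 cost matrix (not from the paper): solve the defender's minimax problem as a linear program and check that its value equals the attacker's maximin value, as the minimax theorem guarantees for finite zero-sum games.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical cost matrix: C[i, j] is the defender's cost Phi when the
# defender plays detector i and the attacker plays strategy j.
C = np.array([[0.2, 0.7, 0.5],
              [0.6, 0.3, 0.4],
              [0.5, 0.5, 0.1]])

def solve_minimax(C):
    """Defender's optimal mixed strategy x: minimize v subject to
    (x^T C)_j <= v for every attacker column j, sum(x) = 1, x >= 0."""
    m, n = C.shape
    c = np.r_[np.zeros(m), 1.0]                      # variables [x_1..x_m, v]
    A_ub = np.c_[C.T, -np.ones(n)]                   # C^T x - v <= 0
    b_ub = np.zeros(n)
    A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)     # sum(x) = 1
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[m]

x, v = solve_minimax(C)
_, w = solve_minimax(-C.T)                           # attacker's side of the game
print("defender mix:", x.round(3), " minimax value:", round(v, 3))
print("maximin value:", round(-w, 3))                # equals the minimax value
```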

  21. Sequential Hypothesis Testing
      • Sequence of random variables $X_1, X_2, \ldots$
        - Honest sensors have $X_1, X_2, \ldots, X_i$ distributed as $f_0(X_1, X_2, \ldots, X_i)$ (defined by DP)
        - A tampered sensor has $X_1, X_2, \ldots, X_i$ distributed as $f_1(X_1, X_2, \ldots, X_i)$ (note that $f_1$ is unknown)
      • Collect samples until we have enough information to make a decision
        - $D = (N, d_N)$, where $N$ is the stopping time and $d_N$ the decision

  22. Sequential Probability Ratio Test (SPRT)
      The solution of this problem is the SPRT:
      $$S_n = \ln \frac{f_1(x_1, \ldots, x_n)}{f_0(x_1, \ldots, x_n)}$$
      Decide $H_1$ when $S_n$ crosses the upper threshold $U$, decide $H_0$ when it crosses the lower threshold $L$, and keep sampling (undecided) while $S_n$ stays between them.
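      A minimal SPRT sketch. It assumes i.i.d. samples and a known $f_1$ for illustration (here a mean-shifted Laplace standing in for a tampered sensor, whereas on the slides $f_1$ is unknown), with ad hoc thresholds $L$ and $U$.

```python
import numpy as np

def sprt(samples, log_lik_ratio, lower, upper):
    """Wald's sequential probability ratio test. Accumulates
    S_n = sum_k ln f1(x_k)/f0(x_k) (i.i.d. case) and stops the first time
    S_n leaves (lower, upper): H1 at the upper threshold, H0 at the lower.
    Returns (decision, stopping_time); decision is None if the stream ends
    while still undecided."""
    s = 0.0
    for n, x in enumerate(samples, start=1):
        s += log_lik_ratio(x)
        if s >= upper:
            return "H1", n
        if s <= lower:
            return "H0", n
    return None, len(samples)

# Hypothetical setup: H0 = Laplace(0, b) (honest DP noise) vs. H1 = Laplace(mu, b).
b, mu = 2.0, 2.0
llr = lambda x: (abs(x) - abs(x - mu)) / b      # ln f1(x)/f0(x) for Laplace
rng = np.random.default_rng(0)
data = rng.laplace(loc=mu, scale=b, size=1000)  # stream from a tampered sensor
print(sprt(data, llr, lower=np.log(0.01), upper=np.log(100)))
```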
