New Directions in Privacy-Preserving Machine Learning
Kamalika Chaudhuri, University of California, San Diego
Sensitive Data: medical records, genetic data, search logs
AOL Violates Privacy
Netflix Violates Privacy [NS08]
[Figure: movie-rating matrix for Users 1-3]
2-8 movie ratings and their dates for Alice reveal: whether Alice is in the dataset or not, and Alice's other movie ratings
High-dimensional Data is Unique
Example: UCSD Employee Salary Table
Position: Faculty | Department: CSE | Gender: Female | Ethnicity: SE Asian | Salary: -
One employee (Kamalika) fits the description!
Simply anonymizing data is unsafe!
Disease Association Studies [WLWTZ09]
Given published correlations (R² values) for the Cancer group and the Healthy group, Alice's DNA reveals: whether Alice is in the Cancer set or the Healthy set
Simply anonymizing data is unsafe! Statistics on small data sets are unsafe!
[Figure: tradeoff between privacy, accuracy, and data size]
Correlated Data: user information in social networks, physical activity monitoring
Why is Privacy Hard for Correlated Data? A neighbor's information leaks information about the user.
Talk Agenda: How do we learn from sensitive data while still preserving privacy?
New Directions: 1. Privacy-preserving Bayesian learning 2. Privacy-preserving statistics on correlated data
Talk Agenda: 1. Privacy for Uncorrelated Data - How to define privacy
Differential Privacy [DMNS06]
[Figure: the same randomized algorithm run on data with and without one person produces "similar" outputs]
Participation of a single person does not change the output
Differential Privacy: Attacker's View
[Figure: the attacker combines prior knowledge with the algorithm's output to reach a conclusion about the data; the same chain is shown with one person's data changed]
Differential Privacy [DMNS06]
If A is an ε-differentially private randomized algorithm, then for all datasets D1, D2 that differ in one person's value and any set S of outputs:
Pr[A(D1) ∈ S] ≤ e^ε Pr[A(D2) ∈ S]
Differential Privacy
1. Provably strong notion of privacy
2. Good approximations for many functions, e.g., means, histograms, etc.
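To make the "good approximations" point concrete, here is a minimal sketch (not from the slides) of the standard Laplace mechanism for an ε-DP mean of values assumed to lie in a bounded range; the function name `private_mean` and the [0, 1] bounds are illustrative assumptions.

```python
import numpy as np

def private_mean(values, epsilon, lower=0.0, upper=1.0):
    """epsilon-DP mean via the Laplace mechanism, assuming each value lies in [lower, upper]."""
    values = np.clip(np.asarray(values, dtype=float), lower, upper)  # enforce the assumed bounds
    n = len(values)
    sensitivity = (upper - lower) / n   # replacing one record moves the mean by at most this much
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

# Example: private estimate of a proportion from 1000 binary records
data = np.random.binomial(1, 0.3, size=1000)
print(private_mean(data, epsilon=0.5))
```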
Interpretation: Attacker's Hypothesis Test [WZ10, OV13]
H0: input to the algorithm = data including a particular person's record
H1: input to the algorithm = data with that record replaced by another person's
Failure events: False Alarm (FA), Missed Detection (MD)
Interpretation: Attacker's Hypothesis Test [WZ10, OV13]
FA = False Alarm, MD = Missed Detection
If the algorithm is ε-DP, then Pr(FA) + e^ε Pr(MD) ≥ 1 and e^ε Pr(FA) + Pr(MD) ≥ 1.
So the feasible region for (Pr(FA), Pr(MD)) lies above the piecewise-linear boundary through (0, 1), (1/(1+e^ε), 1/(1+e^ε)), and (1, 0): the attacker cannot make both errors small.
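A one-step check of where the corner point comes from (my own working, implied directly by the two inequalities above): adding the two ε-DP constraints gives

```latex
(1 + e^{\epsilon})\big(\Pr(\mathrm{FA}) + \Pr(\mathrm{MD})\big) \ge 2
\;\Longrightarrow\;
\Pr(\mathrm{FA}) + \Pr(\mathrm{MD}) \ge \frac{2}{1 + e^{\epsilon}},
```

so the two error probabilities cannot both fall below 1/(1+e^ε); for example, at ε = ln 2 they cannot both be below 1/3.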
Talk Agenda: 1. Privacy for Uncorrelated Data - How to define privacy - Privacy-preserving Learning
Example 1: Flu Test Predicts flu or not, based on patient symptoms Trained on sensitive patient data
Example 2: Clustering Abortion Data Given data on abortion locations, cluster by location while preserving privacy of individuals
Bayesian Learning
Data X = { x1, x2, … }, related through a model class Θ with likelihood p(x | θ)
Prior π(θ) + Data X ⇒ Posterior p(θ | X)
Goal: output the posterior (approximately, or samples from it)
Example: Coin Tosses
Data X = { H, T, H, H, … }, Θ = [0, 1], likelihood p(x | θ) = θ^x (1 − θ)^(1−x)
Prior π(θ) = 1 + Data X (h heads, t tails) ⇒ Posterior p(θ | X) ∝ θ^h (1 − θ)^t
In general, θ is more complex (classifiers, etc.)
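To make this concrete (an illustrative sketch added here, not part of the slides): the uniform prior π(θ) = 1 is Beta(1, 1), so the posterior after h heads and t tails is Beta(1 + h, 1 + t) and can be sampled directly.

```python
import numpy as np

def coin_posterior_sample(flips, n_samples=1000):
    """Sample from the posterior over the coin bias theta under a uniform (Beta(1,1)) prior."""
    h = sum(flips)                 # number of heads (flips encoded as 1 = H, 0 = T)
    t = len(flips) - h             # number of tails
    return np.random.beta(1 + h, 1 + t, size=n_samples)

flips = [1, 0, 1, 1, 0, 1]         # H, T, H, H, T, H
samples = coin_posterior_sample(flips)
print(samples.mean())              # posterior mean estimate of theta
```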
Private Bayesian Learning
Data X = { x1, x2, … }, related through a model class Θ with likelihood p(x | θ)
Prior π(θ) + Data X ⇒ Posterior p(θ | X)
Goal: output a private approximation to the posterior
How to make the posterior private?
Option 1: Direct posterior sampling [Detal14]
Not private except under restrictive conditions: the posteriors p(θ | D) and p(θ | D′) on neighboring datasets may differ too much
How to make the posterior private?
Option 2: Sample from a truncated posterior at high temperature (OPS) [WFS15]
Disadvantages: intractable in general (technically, privacy holds only at convergence); needs more data/subjects
Our Work: Exponential Families
Exponential family distributions: p(x | θ) = h(x) exp(θ^T T(x) − A(θ)), where T is a sufficient statistic
Includes many common distributions: Gaussian, Binomial, Dirichlet, Beta, etc.
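For example (a standard identity, added here for concreteness), the coin-toss likelihood above fits this form with natural parameter η = log(θ / (1 − θ)):

```latex
p(x \mid \theta) = \theta^{x}(1-\theta)^{1-x}
= \exp\!\big( \eta\, x - \log(1 + e^{\eta}) \big),
\qquad \eta = \log\tfrac{\theta}{1-\theta},\quad h(x)=1,\quad T(x)=x,\quad A(\eta)=\log(1+e^{\eta}).
```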
Properties of Exponential Families
Exponential families have conjugate priors: the posterior p(θ | X) obtained from prior π(θ) and data X is in the same distribution class as π(θ)
e.g., Gaussian-Gaussian, Beta-Binomial, etc.
Sampling from Exponential Families
(Non-private) posterior, given data x1, x2, …, also comes from an exponential family:
p(θ | X) ∝ exp( η(θ)^T (Σi T(xi)) − B(θ) )
Private sampling:
1. If T is bounded, add noise to Σi T(xi) to get a private version T′
2. Sample from the perturbed posterior: p(θ | X) ∝ exp( η(θ)^T T′ − B(θ) )
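A minimal sketch of this two-step recipe for the Beta-Bernoulli case (my own illustration; the paper's exact noise calibration and truncation details may differ). Here the sufficient statistic is the number of heads, which changes by at most 1 when one record is replaced, so Laplace(1/ε) noise on it gives an ε-DP statistic; the clipping step is a simplification to keep the perturbed posterior well defined.

```python
import numpy as np

def private_beta_posterior_sample(flips, epsilon, n_samples=1000):
    """Sample from an epsilon-DP perturbed Beta posterior for a coin bias.

    The sufficient statistic (number of heads) has sensitivity 1 under
    replace-one neighboring datasets, so Laplace(1/epsilon) noise suffices.
    """
    n = len(flips)                                     # dataset size (fixed across neighbors)
    h = sum(flips)                                     # sufficient statistic: number of heads
    h_noisy = h + np.random.laplace(scale=1.0 / epsilon)
    h_noisy = float(np.clip(h_noisy, 0.0, n))          # project back so the posterior is well defined
    # Sample from the perturbed posterior Beta(1 + h', 1 + (n - h')) under a uniform prior
    return np.random.beta(1.0 + h_noisy, 1.0 + (n - h_noisy), size=n_samples)

flips = np.random.binomial(1, 0.7, size=500)
print(private_beta_posterior_sample(flips, epsilon=1.0).mean())
```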
Performance • Theoretical Guarantees • Experiments
Theoretical Guarantees
Performance measure: asymptotic relative efficiency (lower = more sample-efficient for large n)
Non-private: 2 | Our method: 2 | [WFS15]: max(2, 1 + 1/ε)
Experiments - Task
Task: time-series clustering of events in the WikiLeaks war logs while preserving event-level privacy
Data: war-log entries from Afghanistan (75K) and Iraq (390K)
Goal: cluster the entries in each region based on features (casualty counts, enemy/friendly fire, explosive hazards, etc.)
Experiments - Model
Hidden Markov Model for each region, with discrete hidden states h_t and observed features x_t
Transition parameters T: T_ij = P(h_{t+1} = i | h_t = j)
Emission parameters O: O_ij = P(x_t = i | h_t = j)
Goal: sample from the posterior P(O | data), which is in the exponential family
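As a rough sketch of how the same recipe might be applied to the emission parameters here (my simplification, not the paper's actual sampler: it treats the per-state emission counts as fixed sufficient statistics and ignores how the latent states themselves depend on the data, which the real analysis must handle), one could perturb the count matrix and sample each row of O from a Dirichlet posterior:

```python
import numpy as np

def private_emission_sample(counts, epsilon, alpha=1.0):
    """Sample emission rows O[j] ~ Dirichlet(alpha + noisy counts[j]) for each hidden state j.

    Simplifying assumptions (see text): counts[j, i] = number of times observation i
    was emitted from state j, treated as a fixed sufficient statistic; replacing one
    event moves one count down and another up, so we use L1 sensitivity 2.
    """
    noisy = counts + np.random.laplace(scale=2.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0.0, None)                    # keep the Dirichlet parameters positive
    return np.array([np.random.dirichlet(alpha + row) for row in noisy])

# Toy example: 2 hidden states, 9 observed event categories
counts = np.random.randint(0, 50, size=(2, 9)).astype(float)
O = private_emission_sample(counts, epsilon=1.0)
print(O.sum(axis=1))   # each row sums to 1
```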
Experiments - Results
[Figure: test-set log-likelihood vs. total epsilon (0.1 to 10) for Afghanistan and Iraq, comparing non-private HMM, non-private naive Bayes, Laplace mechanism HMM, and OPS HMM (truncation multiplier = 100)]
Experiments - States
[Figure: learned emission distributions for Iraq States 1 and 2 over event categories (criminal event, enemy action, explosive hazard, friendly action, friendly fire, non-combat event, other, suspicious incident, threat report), event types (e.g., IED explosion, IED found/cleared, direct fire, indirect fire, small arms threat, cache found/cleared, detain, raid, murder, escalation of force), and casualty types (friendly and host, civilian, enemy)]
Experiments - Clustering
[Figure: inferred state (State 1 vs. State 2) by month, Jan 2004 to Jan 2008, for region codes MND-BAGHDAD, MND-C, MND-N, MND-SE, MNF-W, with the surge announcement and peak troop levels marked]
Conclusion
New method for private posterior sampling from exponential families
Open problems:
1. Private sampling from more complex posteriors
2. Private versions of other Bayesian posterior approximation schemes (variational Bayes, etc.)
3. Combining Bayesian inference with more relaxed forms of DP (e.g., concentrated DP, distributional DP, etc.)
Talk Agenda: 1. Privacy for Uncorrelated Data - How to define privacy - Privacy-preserving Bayesian Learning 2. Privacy for Correlated Data