Privacy-preserving Mechanisms for Correlated Data Kamalika Chaudhuri University of California, San Diego Joint work with Shuang Song and Yizhen Wang
Sensitive Data Medical Records Search Logs Social Networks
Talk Agenda: How do we analyze sensitive data while still preserving privacy? (Focus on correlated data)
Correlated Data User information in social networks Physical Activity Monitoring
Why is Privacy Hard for Correlated Data? Because a neighbor’s information leaks information about the user
Talk Agenda: 1. Privacy for Correlated Data - How to define privacy (for uncorrelated data)
Differential Privacy [DMNS06]
[Diagram: data with and without one person’s record, fed to a randomized algorithm, yields “similar” output distributions]
Participation of a single person does not change the output
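Differential privacy is commonly achieved with the Laplace mechanism: add Laplace noise with scale sensitivity/ε to the query answer. A minimal sketch (the function names are mine, not from the talk; the sampler uses the fact that a difference of two independent exponentials is Laplace-distributed):

```python
import random

def laplace_noise(scale: float) -> float:
    """Sample Lap(0, scale) as a difference of two Exp(1/scale) variables."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release true_value with epsilon-DP, given the query's L1 sensitivity."""
    return true_value + laplace_noise(sensitivity / epsilon)
```

For a counting query (sensitivity 1), `laplace_mechanism(count, 1.0, epsilon)` gives the standard ε-DP release.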
Differential Privacy: Attacker’s View
[Diagram: prior knowledge + output of algorithm on data = attacker’s conclusion, essentially the same whether or not Alice’s record is in the data]
Note:
a. The algorithm could still draw personal conclusions about Alice
b. Alice has the agency to participate or not
What happens with correlated data?
Example 1: Activity Monitoring
Goal: Share aggregate data on physical activity with a doctor, while hiding activity at each specific time.
Agency is at the individual level.
Example 2: Spread of Flu in a Network
Interaction Network
Goal: Publish aggregate statistics over a set of schools, while preventing an adversary from learning who has the flu.
Agency is at the school level.
Why does correlated data require a different notion of privacy?
Example: Activity Monitoring
D = (x_1, ..., x_T), x_t = activity at time t
[Correlation network over the time entries]
Goal: (1) Publish an activity histogram (2) Prevent an adversary from knowing the activity at time t
Agency is at the individual level, not the time-entry level
Example: Activity Monitoring
D = (x_1, ..., x_T), x_t = activity at time t
1-DP: Output histogram of activities + noise with stdev T
Too much noise, so no utility!
Example: Activity Monitoring
D = (x_1, ..., x_T), x_t = activity at time t
1-entry-DP: Output histogram of activities + noise with stdev 1
Not enough: activities across time are correlated!
Example: Activity Monitoring
D = (x_1, ..., x_T), x_t = activity at time t
1-entry-group-DP: Output histogram of activities + noise with stdev T
Too much noise, so no utility!
Pufferfish Privacy [KM12]
Secret Set S
S: information to be protected
e.g.: Alice’s age is 25, Bob has a disease
Pufferfish Privacy [KM12]
Secret Set S, Secret Pairs Set Q
Q: pairs of secrets we want to be indistinguishable
e.g.: (Alice’s age is 25, Alice’s age is 40); (Bob is in the dataset, Bob is not in the dataset)
Pufferfish Privacy [KM12]
Secret Set S, Secret Pairs Set Q, Distribution Class Θ
Θ: a set of distributions that plausibly generate the data
e.g.: (connection graph G, disease transmits w.p. [0.1, 0.5]); (Markov chain with transition matrix in a set P)
May be used to model correlation in data
Pufferfish Privacy [KM12]
Secret Set S, Secret Pairs Set Q, Distribution Class Θ
An algorithm A is ε-Pufferfish private with parameters (S, Q, Θ) if for all (s_i, s_j) in Q, all θ ∈ Θ with X ∼ θ, and all outputs t:
P(A(X) = t | s_i, θ) ≤ e^ε · P(A(X) = t | s_j, θ)
whenever P(s_i | θ) > 0 and P(s_j | θ) > 0
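To make the definition concrete, here is a toy check of my own (not from the talk): for randomized response on a single bit, the secrets are the bit’s two values, the conditional output probabilities do not depend on θ, and the smallest ε satisfying the inequality is the worst-case log-likelihood ratio, log((1-q)/q):

```python
import math

def randomized_response_probs(q: float) -> dict:
    """P(output = t | secret bit = s): report the truth w.p. 1-q, flip w.p. q."""
    return {(s, t): (1.0 - q) if t == s else q for s in (0, 1) for t in (0, 1)}

def worst_case_log_ratio(probs: dict) -> float:
    """Smallest epsilon for which the Pufferfish inequality holds for this secret pair."""
    return max(abs(math.log(probs[(0, t)] / probs[(1, t)])) for t in (0, 1))
```

For q = 0.25 this gives log 3, i.e. the mechanism is (log 3)-Pufferfish private for this secret pair.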
Pufferfish “Includes” DP [KM12]
Theorem: Pufferfish = Differential Privacy when:
S = { s_{i,a} := person i has value a, for all i and all a in domain X }
Q = { (s_{i,a}, s_{i,b}), for all i and all (a, b) pairs in X × X }
Θ = { distributions where each person i is independent }
Theorem: No utility is possible when Θ = { all possible distributions }
Talk Agenda: 1. Privacy for Correlated Data - How to define privacy (for uncorrelated data) - How to define privacy (for correlated data) 2. Privacy Mechanisms - A General Pufferfish Mechanism
How to get Pufferfish privacy?
Special-case mechanisms [KM12, HMD12]
Is there a more general Pufferfish mechanism for a large class of correlated data?
Our work: Yes, two:
a. Wasserstein Mechanism
b. Markov Quilt Mechanism
(Also concurrent work [GK16])
Correlation Measure: Bayesian Networks
Node: variable; structure: directed acyclic graph
Joint distribution of the variables:
Pr(X_1, X_2, ..., X_n) = ∏_i Pr(X_i | parents(X_i))
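The factorization can be evaluated directly. A small sketch for discrete variables (the toy two-node network and all names are mine, for illustration):

```python
def joint_prob(assignment, parents, cpts):
    """Pr(X_1..X_n) = prod_i Pr(X_i | parents(X_i)) for discrete variables.

    parents[v]: tuple of parent names of v.
    cpts[v][parent_values][value]: conditional probability table of v.
    """
    p = 1.0
    for v in cpts:
        pv = tuple(assignment[u] for u in parents[v])
        p *= cpts[v][pv][assignment[v]]
    return p

# Toy chain X1 -> X2
parents = {"X1": (), "X2": ("X1",)}
cpts = {
    "X1": {(): {0: 0.5, 1: 0.5}},
    "X2": {(0,): {0: 0.8, 1: 0.2}, (1,): {0: 0.2, 1: 0.8}},
}
```

For instance, `joint_prob({"X1": 0, "X2": 0}, parents, cpts)` evaluates Pr(X_1 = 0) · Pr(X_2 = 0 | X_1 = 0) = 0.5 · 0.8.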
A Simple Example
X_1 → X_2 → X_3 → ... → X_n
Model: X_i in {0, 1}
State transition probabilities: stay w.p. p, flip w.p. 1 - p
Pr(X_2 = 0 | X_1 = 0) = p, Pr(X_2 = 0 | X_1 = 1) = 1 - p
Pr(X_i = 0 | X_1 = 0) = 1/2 + (2p - 1)^{i-1} / 2
Pr(X_i = 0 | X_1 = 1) = 1/2 - (2p - 1)^{i-1} / 2
Influence of X_1 diminishes with distance
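The closed form above can be checked numerically by iterating the transition matrix. A quick sketch (assuming the symmetric two-state chain on this slide; function name is mine):

```python
def dist_given_start(p: float, i: int, start: int):
    """Distribution of X_i given X_1 = start, for the two-state chain
    that stays in its state w.p. p and flips w.p. 1 - p."""
    d = [0.0, 0.0]
    d[start] = 1.0
    for _ in range(i - 1):  # apply the transition matrix i-1 times
        d = [d[0] * p + d[1] * (1 - p), d[0] * (1 - p) + d[1] * p]
    return d
```

For p = 0.9 the result matches Pr(X_i = 0 | X_1 = 0) = 1/2 + (2p - 1)^{i-1}/2, and the conditional distribution converges to (1/2, 1/2) as i grows, which is the diminishing influence the slide describes.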
Algorithm: Main Idea
X_1 → X_2 → X_3 → ... → X_n
Goal: Protect X_1
Split the nodes: local nodes (high correlation with X_1) vs. the rest (almost independent)
Add noise to hide the local nodes + a small correction for the rest
Measuring “Independence”
Max-influence of X_i on a set of nodes X_R:
e(X_R | X_i) = max_{a,b} sup_{θ ∈ Θ} max_{x_R} log [ Pr(X_R = x_R | X_i = a, θ) / Pr(X_R = x_R | X_i = b, θ) ]
Low e(X_R | X_i) means X_R is almost independent of X_i
To protect X_i, the correction factor needed for X_R is exp(e(X_R | X_i))
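For small models the max-influence can be computed by brute force over values. A sketch for the two-state chain from the earlier example, with a single known θ so the sup over Θ is trivial (names are mine):

```python
import math

def dist_given_start(p, i, start):
    """Distribution of X_i given X_1 = start (symmetric two-state chain)."""
    d = [0.0, 0.0]
    d[start] = 1.0
    for _ in range(i - 1):
        d = [d[0] * p + d[1] * (1 - p), d[0] * (1 - p) + d[1] * p]
    return d

def max_influence_single_node(p: float, i: int) -> float:
    """e({X_i} | X_1): worst-case log-likelihood ratio over starting values a, b."""
    d = [dist_given_start(p, i, s) for s in (0, 1)]
    return max(abs(math.log(d[a][x] / d[b][x]))
               for a in (0, 1) for b in (0, 1) for x in (0, 1))
```

As expected, the max-influence shrinks as the node X_i moves farther from X_1.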
How to find large “almost independent” sets
Brute-force search is expensive
Use structural properties of the Bayesian network
Markov Blanket
Markov Blanket(X_i) = X_S: set of nodes such that X_i is independent of X \ (X_i ∪ X_S) given X_S
(usually: parents, children, and other parents of children)
Define: Markov Quilt
X_Q is a Markov Quilt of X_i if:
1. Deleting X_Q breaks the graph into X_N and X_R
2. X_i lies in X_N
3. X_R is independent of X_i given X_Q
(For the Markov Blanket, X_N = {X_i})
Recall: Algorithm
X_1 → X_2 → X_3 → ... → X_n
Goal: Protect X_1
Local nodes (high correlation) vs. the rest (almost independent)
Add noise to hide the local nodes + a small correction for the rest
Why do we need Markov Quilts?
Given a Markov Quilt: X_N = local nodes for X_i, and X_Q ∪ X_R = the rest
Need to search over Markov Quilts X_Q to find the one that needs the optimal amount of noise
From Markov Quilts to Amount of Noise
Let X_Q = Markov Quilt for X_i
Stdev of noise to protect X_i:
Score(X_Q) = card(X_N) / (ε − e(X_Q | X_i))
(card(X_N): noise due to X_N; e(X_Q | X_i): correction for X_Q ∪ X_R)
The Markov Quilt Mechanism
For each X_i: find the Markov Quilt X_Q for X_i with minimum score s_i
Output F(D) + (max_i s_i) · Z, where Z ∼ Lap(1)
Theorem: This preserves ε-Pufferfish privacy
Advantage: poly-time in special cases
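For the symmetric binary chain, the quilt search can be sketched explicitly. This is my own simplification, under strong assumptions: a single known transition probability p, a uniform initial distribution, and a node far from the chain’s ends, so that given X_i the quilt nodes X_{i-a} and X_{i+b} are conditionally independent and their max-influences add:

```python
import math

def edge_influence(p: float, k: int) -> float:
    """Max-influence of X_i on a single node k steps away (symmetric binary chain)."""
    r = (2.0 * p - 1.0) ** k
    return math.log((1.0 + r) / (1.0 - r))

def min_quilt_score(p: float, epsilon: float, chain_len: int, max_dist: int = 30) -> float:
    """Search quilts X_Q = {X_{i-a}, X_{i+b}} for an interior node X_i and
    return the minimum of score(X_Q) = |X_N| / (epsilon - e(X_Q | X_i))."""
    best = chain_len / epsilon  # trivial quilt: hide the whole chain
    for a in range(1, max_dist + 1):
        for b in range(1, max_dist + 1):
            e = edge_influence(p, a) + edge_influence(p, b)
            if e < epsilon:  # quilt usable only if influence is below the budget
                best = min(best, (a + b - 1) / (epsilon - e))
    return best
```

The mechanism would then release F(D) + (max_i s_i) · Z with Z ∼ Lap(1); for moderately correlated chains the resulting scale sits well below the group-DP scale T/ε.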
Example: Activity Monitoring
D = (x_1, ..., x_T), x_t = activity at time t
(Minimal) Markov Quilts for X_i have the form {X_{i-a}, X_{i+b}}
Efficiently searchable
Example: Activity Monitoring
X: set of states; each θ ∈ Θ: a transition matrix P_θ
Under some assumptions, the relevant parameters are:
π_Θ = min_{x ∈ X, θ ∈ Θ} π_θ(x)   (min probability of any x under the stationary distribution)
g_Θ = min_{θ ∈ Θ} min { 1 − |λ| : P_θ x = λx, λ < 1 }   (min eigengap of any P_θ)
Max-influence of X_Q = {X_{i-a}, X_{i+b}} for X_i:
e(X_Q | X_i) ≤ log [ (π_Θ + exp(−g_Θ b)) / (π_Θ − exp(−g_Θ b)) ] + 2 log [ (π_Θ + exp(−g_Θ a)) / (π_Θ − exp(−g_Θ a)) ]
Score(X_Q) = (a + b − 1) / (ε − e(X_Q | X_i))
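The parameters and the bound can be evaluated directly; a sketch of my own (the helper computes π and the eigengap analytically for a two-state chain, whose eigenvalues are 1 and a − b; the factor of 2 on the a-term follows the slide as displayed, and the bound only makes sense when the exponential terms are below π_Θ):

```python
import math

def two_state_params(a: float, b: float):
    """For transition matrix [[a, 1-a], [b, 1-b]] (rows = current state 0/1):
    return (min stationary probability, eigengap 1 - |a - b|)."""
    pi0 = b / (1.0 - a + b)  # stationary probability of state 0
    return min(pi0, 1.0 - pi0), 1.0 - abs(a - b)

def max_influence_bound(pi: float, g: float, a: int, b: int) -> float:
    """Upper bound on e(X_Q | X_i) for the quilt X_Q = {X_{i-a}, X_{i+b}}."""
    ea, eb = math.exp(-g * a), math.exp(-g * b)
    assert ea < pi and eb < pi, "bound valid only when exp(-g * dist) < pi"
    return (math.log((pi + eb) / (pi - eb))
            + 2.0 * math.log((pi + ea) / (pi - ea)))
```

As expected, the bound shrinks as the quilt nodes move farther from X_i, so wider quilts trade a larger X_N for a smaller influence correction.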