Zhenjie Zhang Advanced Digital Sciences Center, Singapore (Thanks to Xiaokui Xiao for contributing slides)
Formulation of Privacy
What information can be published: the average height of US people, or the height of an individual?
Intuition: If something is insensitive to the change of any individual tuple, then it should not be considered private.
Example: Assume that we arbitrarily change the height of an individual in the US. The average height of US people would remain roughly the same, i.e., the average height reveals little information about the exact height of any particular individual.
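The intuition above can be checked numerically. The following is a minimal sketch on a synthetic population; the population size, the mean and spread of the generated heights, and the replacement value of 250 cm are all assumptions made for illustration, not figures from the slides.

```python
import random


def average(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Synthetic heights in cm (assumed values, for illustration only).
rng = random.Random(0)
heights = [rng.gauss(170.0, 10.0) for _ in range(100_000)]

before = average(heights)

# Arbitrarily change the height of one individual.
heights[0] = 250.0
after = average(heights)

# The average moves by at most |new - old| / population size.
shift = abs(after - before)
print(f"average before: {before:.4f} cm")
print(f"average after:  {after:.4f} cm")
print(f"shift:          {shift:.6f} cm")
```

The shift shrinks as the population grows, which is why the average says little about any single tuple.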
ε-Differential Privacy
Definition:
Neighboring datasets: two datasets D and D′ such that D′ can be obtained by changing one single tuple in D.
A randomized algorithm A satisfies ε-differential privacy iff, for any two neighboring datasets D and D′ and for any output O of A,
    Pr[A(D) = O] ≤ exp(ε) · Pr[A(D′) = O]
Example of neighboring datasets (only Chris's tuple differs):
D:
    Name   Gender  Age  Diabetes
    Alice  F       28   Y
    Bob    M       19   Y
    Chris  M       25   N
    Doug   M       30   N
D′:
    Name   Gender  Age  Diabetes
    Alice  F       28   Y
    Bob    M       19   Y
    Chris  M       23   Y
    Doug   M       30   N
ε-Differential Privacy
Intuition: It is OK to publish information that is insensitive to changes of any particular tuple.
[Figure: the output probabilities Pr[A(D) = O] and Pr[A(D′) = O] plotted over the # of diabetes patients; at every output O their ratio is ≤ exp(ε).]
The value of ε decides the degree of privacy protection: the smaller ε is, the closer the two output distributions must be, and the stronger the protection.
Achieving ε-Differential Privacy
It won't work if we release the number directly:
D: the original dataset; # of diabetes patients h = 2 (Alice, Bob)
D′: modify an arbitrary patient in D; # of diabetes patients h′ = 3 (Alice, Bob, Chris)
Pr[A(D) = O] ≤ exp(ε) · Pr[A(D′) = O] does not hold for any ε.
[Figure: over the # of diabetes patients (axis marks 0, 1, h = 2, h′ = 3), Pr[A(D) = h] and Pr[A(D′) = h′] are each a single spike of 100%.]
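To make the failure concrete, here is a minimal sketch that treats the exact count as a deterministic "algorithm" and evaluates both sides of the inequality on the two example tables. The function names and the dictionary layout are hypothetical, chosen only for this illustration.

```python
import math


def exact_count(dataset):
    """Deterministic release: the true number of diabetes patients."""
    return sum(1 for row in dataset if row["diabetes"] == "Y")


def output_probability(dataset, output):
    """Probability that the deterministic release equals `output` (1 or 0)."""
    return 1.0 if exact_count(dataset) == output else 0.0


# The original dataset and a neighbor in which Chris's tuple is changed.
original = [
    {"name": "Alice", "gender": "F", "age": 28, "diabetes": "Y"},
    {"name": "Bob", "gender": "M", "age": 19, "diabetes": "Y"},
    {"name": "Chris", "gender": "M", "age": 25, "diabetes": "N"},
    {"name": "Doug", "gender": "M", "age": 30, "diabetes": "N"},
]
neighbor = [dict(row) for row in original]
neighbor[2].update(age=23, diabetes="Y")

count_original = exact_count(original)
count_neighbor = exact_count(neighbor)

# Probability of seeing the original count under each dataset.
p_original = output_probability(original, count_original)
p_neighbor = output_probability(neighbor, count_original)

for epsilon in (0.1, 1.0, 10.0, 100.0):
    holds = p_original <= math.exp(epsilon) * p_neighbor
    print(f"epsilon={epsilon}: inequality holds? {holds}")
```

Because the right-hand side is exp(epsilon) times zero, no finite privacy parameter can satisfy the inequality.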
Achieving ε-Differential Privacy
Idea: Perturb the number of diabetes patients to obtain a smooth distribution.
[Figure: smoothed output distributions for D and D′ over the # of diabetes patients (axis marks 0, 1, h = 2, h′ = 3); at every output the ratio of the two probabilities is bounded.]
Laplace Distribution
pdf(x) = (1/(2λ)) · exp(−|x|/λ)
Increasing or decreasing x by 1 changes pdf(x) by a factor of at most exp(1/λ) (exactly exp(−1/λ) when |x| grows by 1).
λ is referred to as the scale.
[Figure: pdf(x) for x from −10 to 10 with λ = 1, λ = 2, and λ = 4; vertical axis from 0 to 0.5.]
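The density and its one-step factor can be evaluated directly. This is a small sketch using only the standard library; the function name laplace_pdf and the sample points 2 and 3 are choices made for illustration.

```python
import math


def laplace_pdf(x, scale):
    """Density of the zero-centered Laplace distribution with the given scale."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


for scale in (1.0, 2.0, 4.0):
    peak = laplace_pdf(0.0, scale)
    # Moving x from 2 to 3 increases |x| by 1.
    factor = laplace_pdf(3.0, scale) / laplace_pdf(2.0, scale)
    print(f"scale={scale}: peak={peak:.4f}, "
          f"one-step factor={factor:.4f}, exp(-1/scale)={math.exp(-1.0 / scale):.4f}")
```

A larger scale gives a lower peak and a one-step factor closer to 1, i.e., a flatter curve.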
Differential Privacy via Laplace Noise
Dataset: A set of patients
Release # of diabetes patients with ε-differential privacy
Objective: Pr[A(D) = O] ≤ exp(ε) · Pr[A(D′) = O]
Method: Release the number + Laplace noise, where pdf(x) = (1/(2λ)) · exp(−|x|/λ)
Rationale:
D: the original dataset; # of diabetes patients = h
D′: modify a patient in D; # of diabetes patients = h′
For any released value y:
    Pr[A(D) = y] = pdf(y − h) = exp(−|y − h|/λ)/(2λ)
    Pr[A(D′) = y] = pdf(y − h′) = exp(−|y − h′|/λ)/(2λ)
    Pr[A(D) = y] / Pr[A(D′) = y] = pdf(y − h) / pdf(y − h′)
                                 = [exp(−|y − h|/λ)/(2λ)] / [exp(−|y − h′|/λ)/(2λ)]
                                 = exp((|y − h′| − |y − h|)/λ)
                                 ≤ exp(|h − h′|/λ)
The last step is the triangle inequality. Since modifying one patient changes the count by at most 1, |h − h′| ≤ 1, so the ratio is at most exp(1/λ); choosing λ = 1/ε meets the objective.
[Figure: the two Laplace-shaped curves Pr[A(D) = O] and Pr[A(D′) = O] over the # of diabetes patients, centered at h and h′; at any output y the ratio of the two is bounded.]
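The method on this slide can be sketched in a few lines. This is an illustrative sketch, not a production mechanism: the function names are hypothetical, the scale is set to 1/epsilon on the assumption that changing one patient moves the count by at most 1, and the noise is drawn as the difference of two exponential samples, which is one standard way to obtain a Laplace sample.

```python
import math
import random


def laplace_noise(scale, rng):
    """Laplace sample: the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release the count plus Laplace noise with scale 1/epsilon (assumed sensitivity 1)."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


def laplace_pdf(x, scale):
    """Density of the zero-centered Laplace distribution."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def density_ratio(y, count, neighbor_count, epsilon):
    """Ratio of the output densities at y for two neighboring datasets."""
    scale = 1.0 / epsilon
    return laplace_pdf(y - count, scale) / laplace_pdf(y - neighbor_count, scale)


rng = random.Random(42)
epsilon = 0.5
print("one noisy release of the count 2:", noisy_count(2, epsilon, rng))

# The density ratio never exceeds exp(epsilon), whatever value is released.
worst = max(density_ratio(y / 10.0, 2, 3, epsilon) for y in range(-200, 201))
print("largest ratio seen:", worst, "bound:", math.exp(epsilon))
```

A smaller epsilon gives a larger scale, hence more noise and a tighter ratio bound; this is the privacy and accuracy trade-off that the scale controls.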