
Formulation of Privacy - PowerPoint PPT Presentation

Zhenjie Zhang, Advanced Digital Sciences Center, Singapore (thanks to Xiaokui Xiao for contributing slides)


  1. Zhenjie Zhang, Advanced Digital Sciences Center, Singapore (thanks to Xiaokui Xiao for contributing slides)

  2. Formulation of Privacy
     - What information can be published?
       - Average height of US people
       - Height of an individual
     - Intuition:
       - If something is insensitive to the change of any individual tuple, then it should not be considered private
     - Example:
       - Assume that we arbitrarily change the height of an individual in the US
       - The average height of US people would remain roughly the same
       - i.e., the average height reveals little information about the exact height of any particular individual
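
The intuition on this slide can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names, the 140 to 200 cm height range, and the population size are all assumptions chosen for illustration. It measures how far the average moves when a single value is replaced, and compares that with the worst case `(high - low) / n`.

```python
import random


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def average_shift(values, index, new_value):
    """Absolute change in the average when values[index] is replaced by new_value."""
    changed = list(values)
    changed[index] = new_value
    return abs(average(changed) - average(values))


def worst_case_shift(n, low, high):
    """Largest possible change in the average of n values in [low, high]
    when exactly one value is replaced by another value in [low, high]."""
    return (high - low) / n


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population: 100,000 heights drawn uniformly from 140-200 cm.
    heights_cm = [rng.uniform(140.0, 200.0) for _ in range(100_000)]
    shift = average_shift(heights_cm, 0, 200.0)  # arbitrarily change one person
    bound = worst_case_shift(len(heights_cm), 140.0, 200.0)
    print(f"average moved by {shift:.6f} cm (worst case {bound:.6f} cm)")
```

With a large population the average barely moves, while the individual's own value can change by up to the full range.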

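  3. $\varepsilon$-Differential Privacy
     - Definition:
       - Neighboring datasets: two datasets $D$ and $D'$, such that $D'$ can be obtained by changing one single tuple in $D$
       - A randomized algorithm $A$ satisfies $\varepsilon$-differential privacy iff, for any two neighboring datasets $D$ and $D'$ and for any output $O$ of $A$,
         $\Pr[A(D) = O] \le \exp(\varepsilon) \cdot \Pr[A(D') = O]$
     - Example of two neighboring datasets (only the tuple of Chris differs):

       | Name  | Gender | Age | Diabetes |
       |-------|--------|-----|----------|
       | Alice | F      | 28  | Y        |
       | Bob   | M      | 19  | Y        |
       | Chris | M      | 25  | N        |
       | Doug  | M      | 30  | N        |

       | Name  | Gender | Age | Diabetes |
       |-------|--------|-----|----------|
       | Alice | F      | 28  | Y        |
       | Bob   | M      | 19  | Y        |
       | Chris | M      | 23  | Y        |
       | Doug  | M      | 30  | N        |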

  4. $\varepsilon$-Differential Privacy
     - Intuition:
       - It is OK to publish information that is insensitive to changes of any particular tuple
     - Definition:
       - Neighboring datasets: two datasets $D$ and $D'$, such that $D'$ can be obtained by changing one single tuple in $D$
       - A randomized algorithm $A$ satisfies $\varepsilon$-differential privacy iff, for any two neighboring datasets $D$ and $D'$ and for any output $O$ of $A$,
         $\Pr[A(D) = O] \le \exp(\varepsilon) \cdot \Pr[A(D') = O]$
     - The value of $\varepsilon$ decides the degree of privacy protection (a smaller $\varepsilon$ forces the two probabilities closer together, i.e., stronger protection)
     [Figure: probabilities $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients; annotation: ratio $\le \exp(\varepsilon)$]

  5. Achieving $\varepsilon$-Differential Privacy
     - It won’t work if we release the number directly:
       - $D$: the original dataset
       - $D'$: modify an arbitrary patient in $D$
       - $\Pr[A(D) = O] \le \exp(\varepsilon) \cdot \Pr[A(D') = O]$ does not hold for any $\varepsilon$: when the two counts differ, $A(D)$ outputs the count $h$ of $D$ with probability 100%, while $A(D')$ outputs $h$ with probability 0
     [Figure: $\Pr[A(D) = h]$ and $\Pr[A(D') = h']$ are each 100%, at $h = 2$ and $h' = 3$ on the axis "# of diabetes patients"]
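
The failure of direct release can be made concrete with a small sketch. It is an illustration only: the record layout and the helper names `exact_count_distribution` and `smallest_epsilon` are assumptions, not something defined in the slides. The code builds the output distribution of the "release the exact count" algorithm on the two example datasets and looks for the smallest privacy parameter that would make the differential privacy inequality hold in both directions.

```python
import math

# The two neighboring example datasets from the slides (only Chris differs).
D = [
    {"name": "Alice", "gender": "F", "age": 28, "diabetes": True},
    {"name": "Bob", "gender": "M", "age": 19, "diabetes": True},
    {"name": "Chris", "gender": "M", "age": 25, "diabetes": False},
    {"name": "Doug", "gender": "M", "age": 30, "diabetes": False},
]
D_PRIME = [
    {"name": "Alice", "gender": "F", "age": 28, "diabetes": True},
    {"name": "Bob", "gender": "M", "age": 19, "diabetes": True},
    {"name": "Chris", "gender": "M", "age": 23, "diabetes": True},
    {"name": "Doug", "gender": "M", "age": 30, "diabetes": False},
]


def exact_count_distribution(records):
    """Output distribution of the algorithm that releases the exact count:
    all probability mass sits on the true number of diabetes patients."""
    true_count = sum(1 for r in records if r["diabetes"])
    return {true_count: 1.0}


def smallest_epsilon(dist_a, dist_b):
    """Smallest value e such that p <= exp(e) * q and q <= exp(e) * p for every
    output, or math.inf if no finite value works."""
    worst = 0.0
    for output in set(dist_a) | set(dist_b):
        p = dist_a.get(output, 0.0)
        q = dist_b.get(output, 0.0)
        if p == 0.0 and q == 0.0:
            continue
        if p == 0.0 or q == 0.0:
            return math.inf  # one side is impossible, the other is not
        worst = max(worst, abs(math.log(p / q)))
    return worst


if __name__ == "__main__":
    eps = smallest_epsilon(exact_count_distribution(D),
                           exact_count_distribution(D_PRIME))
    print("smallest epsilon for exact release:", eps)
```

The result is infinite because the output 2 has probability 1 on one dataset and probability 0 on the other, so no finite bound on the ratio exists.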

  6. Achieving $\varepsilon$-Differential Privacy
     - Idea:
       - Perturb the number of diabetes patients to obtain a smooth distribution
     [Figure: $\Pr[A(D) = h]$ and $\Pr[A(D') = h']$ plotted over the # of diabetes patients, with $h = 2$ and $h' = 3$]

  8. Achieving $\varepsilon$-Differential Privacy
     - Idea:
       - Perturb the number of diabetes patients to obtain a smooth distribution
     [Figure: $\Pr[A(D) = h]$ and $\Pr[A(D') = h']$ plotted over the # of diabetes patients, with $h = 2$ and $h' = 3$; annotation: ratio bounded]

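  9. Laplace Distribution
     - $\mathrm{pdf}(x) = \frac{1}{2\lambda} \exp\left(-\frac{|x|}{\lambda}\right)$
     - Increase/decrease $x$ by 1: $\mathrm{pdf}(x)$ changes by a factor between $\exp(-1/\lambda)$ and $\exp(1/\lambda)$
     - $\lambda$ is referred to as the scale
     [Figure: Laplace densities for $\lambda = 1$, $\lambda = 2$, and $\lambda = 4$, plotted for $x$ from -10 to 10]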
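
The shift property of the Laplace density is easy to verify numerically. The sketch below is illustrative only: `laplace_pdf` and `shift_factor` are names introduced here, and `scale` plays the role of the scale parameter from the slide.

```python
import math


def laplace_pdf(x, scale):
    """Density of the zero-centered Laplace distribution with the given scale."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def shift_factor(x, scale, step=1.0):
    """Factor by which the density changes when x moves to x + step."""
    return laplace_pdf(x + step, scale) / laplace_pdf(x, scale)


if __name__ == "__main__":
    scale = 2.0
    low, high = math.exp(-1.0 / scale), math.exp(1.0 / scale)
    for x in (-3.0, -0.3, 0.0, 2.5):
        factor = shift_factor(x, scale)
        # The factor always stays inside [exp(-1/scale), exp(1/scale)].
        print(f"x={x:5.1f}  factor={factor:.4f}  bounds=[{low:.4f}, {high:.4f}]")
```

Away from zero the factor sits exactly at one of the two bounds; a step that crosses zero lands strictly between them.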

  10. Differential Privacy via Laplace Noise
      - Dataset: a set of patients
      - Objective: release the # of diabetes patients with $\varepsilon$-differential privacy:
        $\Pr[A(D) = O] \le \exp(\varepsilon) \cdot \Pr[A(D') = O]$
      - Method: release the number + Laplace noise, where
        $\mathrm{pdf}(x) = \frac{1}{2\lambda} \exp\left(-\frac{|x|}{\lambda}\right)$
      - Rationale:
        - $D$: the original dataset; # of diabetes patients = $h$
        - $D'$: modify a patient in $D$; # of diabetes patients = $h'$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis; annotation: ratio bounded]

  11. Differential Privacy via Laplace Noise
      - Rationale, with $h$ the # of diabetes patients in the original dataset $D$ and $y$ a possible released value:
        $\Pr[A(D) = y] = \mathrm{pdf}(y - h) = \frac{1}{2\lambda} \exp\left(-\frac{|y - h|}{\lambda}\right)$
      [Figure: $\Pr[A(D) = O]$ plotted over the # of diabetes patients, with $h$ and $y$ marked on the axis]

  12. Differential Privacy via Laplace Noise
      - Rationale, with $h'$ the # of diabetes patients in $D'$ (a patient in $D$ modified) and $y$ a possible released value:
        $\Pr[A(D') = y] = \mathrm{pdf}(y - h') = \frac{1}{2\lambda} \exp\left(-\frac{|y - h'|}{\lambda}\right)$
      [Figure: $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $y$ and $h'$ marked on the axis]

  13. Differential Privacy via Laplace Noise
      - Rationale, with $h$ and $h'$ the # of diabetes patients in $D$ and in $D'$ (a patient in $D$ modified):
        $\Pr[A(D') = y] = \mathrm{pdf}(y - h') = \frac{1}{2\lambda} \exp\left(-\frac{|y - h'|}{\lambda}\right)$
        $\Pr[A(D) = y] = \mathrm{pdf}(y - h) = \frac{1}{2\lambda} \exp\left(-\frac{|y - h|}{\lambda}\right)$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis]

  14. Differential Privacy via Laplace Noise
      - Rationale: consider the ratio of the two probabilities at the same released value $y$:
        $\frac{\Pr[A(D') = y]}{\Pr[A(D) = y]}$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis]

  15. Differential Privacy via Laplace Noise
      - Rationale, with $h$ and $h'$ the # of diabetes patients in $D$ and in $D'$ (a patient in $D$ modified):
        $\frac{\Pr[A(D') = y]}{\Pr[A(D) = y]} = \frac{\mathrm{pdf}(y - h')}{\mathrm{pdf}(y - h)}$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis]

  16. Differential Privacy via Laplace Noise
      - Rationale, with $h$ and $h'$ the # of diabetes patients in $D$ and in $D'$ (a patient in $D$ modified):
        $\frac{\Pr[A(D') = y]}{\Pr[A(D) = y]} = \frac{\mathrm{pdf}(y - h')}{\mathrm{pdf}(y - h)} = \frac{\exp(-|y - h'|/\lambda)/(2\lambda)}{\exp(-|y - h|/\lambda)/(2\lambda)}$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis]
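
The step from this ratio of Laplace densities to a bound that no longer depends on the released value can be written out as follows. This is a reconstruction under stated notation, not text from the slides: $h$ and $h'$ are the two counts, $y$ is the released value, $\lambda$ is the Laplace scale, and the last line uses the triangle inequality $|y - h| - |y - h'| \le |h - h'|$.

```latex
\begin{align*}
\frac{\Pr[A(D') = y]}{\Pr[A(D) = y]}
  &= \frac{\exp(-|y - h'|/\lambda)/(2\lambda)}{\exp(-|y - h|/\lambda)/(2\lambda)} \\
  &= \exp\left(\frac{|y - h| - |y - h'|}{\lambda}\right) \\
  &\le \exp\left(\frac{|h - h'|}{\lambda}\right)
\end{align*}
```

Swapping the roles of the two datasets gives the same bound for the inverse ratio. Since changing one patient changes the count by at most 1, $|h - h'| \le 1$, so a scale of $\lambda = 1/\varepsilon$ keeps the ratio at or below $\exp(\varepsilon)$.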

  17. Differential Privacy via Laplace Noise
      - Rationale, with $h$ and $h'$ the # of diabetes patients in $D$ and in $D'$ (a patient in $D$ modified):
        $\frac{\Pr[A(D') = y]}{\Pr[A(D) = y]} = \frac{\mathrm{pdf}(y - h')}{\mathrm{pdf}(y - h)} \le \exp\left(\frac{|h - h'|}{\lambda}\right)$
      [Figure: $\Pr[A(D) = O]$ and $\Pr[A(D') = O]$ plotted over the # of diabetes patients, with $h$, $y$, and $h'$ marked on the axis]
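
The method on these slides, releasing the count plus Laplace noise, can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: the function names are invented here, the noise scale is set to `1 / epsilon` on the assumption that changing one patient moves the count by at most 1, and the noise is drawn as the difference of two exponential samples, which is one standard way to obtain a Laplace variate. Plain floating-point sampling like this is fine for illustration but is not hardened for production use.

```python
import math
import random


def laplace_pdf(x, scale):
    """Density of the zero-centered Laplace distribution with the given scale."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws
    that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_count(true_count, epsilon, rng):
    """Release true_count plus Laplace noise of scale 1/epsilon.
    Assumes one changed tuple alters the count by at most 1."""
    return true_count + sample_laplace(1.0 / epsilon, rng)


def density_ratio(y, h, h_prime, scale):
    """Ratio of the output densities at y for true counts h_prime and h."""
    return laplace_pdf(y - h_prime, scale) / laplace_pdf(y - h, scale)


if __name__ == "__main__":
    rng = random.Random(42)
    epsilon = 0.5           # hypothetical privacy parameter
    h, h_prime = 2, 3       # counts of the two example datasets
    print("noisy count:", release_count(h, epsilon, rng))
    worst = max(density_ratio(y / 10.0, h, h_prime, 1.0 / epsilon)
                for y in range(-100, 150))
    print("largest observed ratio:", worst, "bound exp(epsilon):", math.exp(epsilon))
```

On the grid of released values the density ratio never exceeds `exp(epsilon)`, which matches the bound on the slide when the scale is `1 / epsilon` and the counts differ by 1.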
