CSC2412: Definition of Differential Privacy


  1. CSC2412: Definition of Differential Privacy. Sasho Nikolov

  2. An Ideal Goal
  The study reveals nothing new about any particular individual to an adversary. Example:
  • Adversary believes humans have four fingers on each hand.
  • In particular, believes Sasho has four fingers on each hand.

  3. An Ideal Goal
  The study reveals nothing new about any particular individual to an adversary. Example:
  • Adversary believes humans have four fingers on each hand.
  • In particular, believes Sasho has four fingers on each hand.
  • Study reveals the distribution of the number of fingers per person's hand.
  • Adversary has now learned Sasho probably has five fingers per hand.

  4. An Ideal Goal
  The study reveals nothing new about any particular individual to an adversary. Example:
  • Adversary believes humans have four fingers on each hand.
  • In particular, believes Sasho has four fingers on each hand.
  • Study reveals the distribution of the number of fingers per person's hand.
  • Adversary has now learned Sasho probably has five fingers per hand.
  (Learning about the world is also learning about me.)
  Another example:
  • Adversary believes there is no link between smoking and cancer.
  • Also knows that Sasho smokes.
  • Study reveals a link between smoking and cancer.

  5. Statistical vs Personal Information
  In the examples, the adversary learns statistical information that pertains to Sasho.
  • If science works, it better reveal something about me.
  What information is statistical and what information is personal?
  Test: could the adversary have learned this information if my data were not analyzed?
  • "Humans have five fingers per hand", "smoking causes cancer": yes, so statistical.
  • Whether I have five fingers, whether I smoke: no, so personal.

  6. Towards a Definition
  The algorithm doing the analysis should do almost the same in all the following cases:
  • my data is included in the data set
  • my data is not included in the data set
  • my data is changed in the data set
  I.e., what the algorithm publishes does not depend too strongly on my data.

  7. Data Model
  Data set: a (multi-)set X of n data points, X = { x_1, . . . , x_n }.
  • Each data point (or row) x_i is the data of one person.
  • Each data point comes from a universe 𝒳, e.g. 𝒳 = {0,1}^d (d binary attributes), x_i ∈ 𝒳.
  A data analysis algorithm (a mechanism) is a randomized algorithm M that takes a data set X and produces the results of the data analysis as output. The output M(X) is random for any X.

  8. Almost a Definition
  We call two data sets X and X′ neighbouring if they differ in the data of a single individual:
  1. (variable n) we can get X′ from X by adding or removing an element, e.g. X = { x_1, . . . , x_n }, X′ = { x_1, . . . , x_{n−1} }.
  2. (fixed n) we can get X′ from X by replacing an element with another, e.g. X = { x_1, . . . , x_n }, X′ = { x_1, . . . , x_{i−1}, x_i′, x_{i+1}, . . . , x_n }.
  Definition: A mechanism M is differentially private if, for any two neighbouring datasets X, X′, M(X) ≈ M(X′), i.e. the random variables M(X) and M(X′) are "similar".
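As a sanity check on the two notions of neighbouring, here is a minimal Python sketch (the function names are my own, not from the slides) that treats data sets as multisets:

```python
from collections import Counter

def are_neighbouring_fixed_n(X, X_prime):
    """Fixed-n notion: X' is obtained from X by replacing exactly one element."""
    if len(X) != len(X_prime):
        return False
    a, b = Counter(X), Counter(X_prime)
    # as multisets, exactly one element was removed and one was added
    return sum((a - b).values()) == 1 and sum((b - a).values()) == 1

def are_neighbouring_variable_n(X, X_prime):
    """Variable-n notion: X' is obtained from X by adding or removing one element."""
    a, b = Counter(X), Counter(X_prime)
    diff = sum(((a - b) + (b - a)).values())
    return abs(len(X) - len(X_prime)) == 1 and diff == 1
```

For example, {1, 2, 3} and {1, 2, 4} are neighbouring in the fixed-n sense, while {1, 2, 3} and {1, 2} are neighbouring in the variable-n sense.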

  9. Total Variation Distance Differential Privacy
  Recall d_TV(M(X), M(X′)) = max_S |P(M(X) ∈ S) − P(M(X′) ∈ S)|.
  Definition: A mechanism M is δ-TV differentially private if, for any two neighbouring datasets X, X′, and any set of outputs S,
  |P(M(X) ∈ S) − P(M(X′) ∈ S)| ≤ δ.
  What should δ be?
  • δ < 1/(2n)? For any two datasets X, X′ of size n there is a chain X = X^(0), X^(1), . . . , X^(k) = X′ of neighbouring datasets with k ≤ n, so |P(M(X) ∈ S) − P(M(X′) ∈ S)| ≤ kδ < 1/2: the mechanism does almost the same on all datasets, and is nearly useless.
  • δ ≥ 1/(2n)? "Name and shame" mechanism: for each i, publish x_i with probability δ and publish nothing with probability 1 − δ. This is δ-TV differentially private, yet some person's data is published in the clear with constant probability: intuitively not private.
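The "name and shame" mechanism can be made concrete in a short Python sketch (my own illustration; the placeholder '_' for "not published" is an assumption). It computes the total variation distance between the mechanism's output on a row and on its replacement, and the probability that at least one of n rows gets published:

```python
def tv_distance(p, q):
    """Total variation distance between two discrete distributions,
    given as dicts mapping outcome -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in support)

def name_and_shame_row(x, delta):
    """'Name and shame' on one row: publish it with probability delta,
    otherwise output the placeholder '_'."""
    return {x: delta, '_': 1 - delta}

delta, n = 0.01, 100
# neighbouring datasets differ in one row, so the TV distance of the whole
# output equals the TV distance of that row's distribution:
tv = tv_distance(name_and_shame_row('smoker', delta),
                 name_and_shame_row('non-smoker', delta))
print(tv)  # = delta (up to floating point), so the mechanism is delta-TV DP
# yet with delta >= 1/(2n), someone's raw data appears with constant probability:
print(1 - (1 - delta) ** n)  # about 0.63
```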

  10. Finally, Differential Privacy
  [Dwork, McSherry, Nissim, Smith 2006] (In Vadhan's notes: any conclusion an adversary draws from M(X) could have been drawn from M(X′).)
  Definition: A mechanism M is ε-differentially private if, for any two neighbouring datasets X, X′, and any set of outputs S,
  P(M(X) ∈ S) ≤ e^ε P(M(X′) ∈ S),
  where ε is a small positive constant (for small ε, e^ε ≈ 1 + ε).
  This rules out "name and shame": take S to be the event that something bad happens to me; my risks are almost the same whether or not my data is used.
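For mechanisms with discrete outputs, it suffices to check the ε-DP inequality pointwise over outcomes, since summing over y ∈ S then gives the bound for every set S. A small Python sketch (the helper is my own, not from the slides), applied to a single bit of the randomized response mechanism defined on slide 12:

```python
import math

def eps_dp_holds(p, q, eps):
    """Check P(M(X)=y) <= e^eps * P(M(X')=y) for every outcome y; for
    discrete outputs this pointwise bound implies the bound for every set S."""
    support = set(p) | set(q)
    return all(p.get(y, 0.0) <= math.exp(eps) * q.get(y, 0.0) for y in support)

# a single randomized-response bit (slide 12) on neighbouring inputs:
eps = 1.0
keep = math.exp(eps) / (1 + math.exp(eps))
p = {1: keep, 0: 1 - keep}   # q(x_i) = 1
q = {0: keep, 1: 1 - keep}   # q(x_i') = 0
print(eps_dp_holds(p, q, 1.01), eps_dp_holds(p, q, 0.5))  # True False
```

The check passes at privacy level slightly above ε = 1 (the worst-case outcome ratio is exactly e^1) and fails at ε = 0.5.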

  11. A Hypothesis Testing Viewpoint
  [Wasserman, Zhou] (not essential)
  Suppose X = { X_1, . . . , X_n } are drawn IID from some distribution. The adversary A wants to use M(X) to test which hypothesis holds:
  • H_0: X_i = y_0 (e.g., "Sasho does not smoke")
  • H_1: X_i = y_1 (e.g., "Sasho smokes")
  Then, if M is ε-DP, for any A that sees M(X) and outputs "H_0" or "H_1",
  P(A(M(X)) = "H_1" | H_1) ≤ e^ε P(A(M(X)) = "H_1" | H_0),
  i.e. true positive rate (1 − Type II error) ≤ e^ε × false positive rate (Type I error).
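For a single bit of the randomized response mechanism on slide 12, the inequality holds with equality for the natural test "say H_1 iff the released bit is 1". A short Python sketch (my own, under that assumed test):

```python
import math

def rr_bit_dist(bit, eps):
    """Distribution of a single randomized-response bit (slide 12)."""
    p_keep = math.exp(eps) / (1 + math.exp(eps))
    return {bit: p_keep, 1 - bit: 1 - p_keep}

def tpr_fpr(eps):
    """Adversary says "H_1" iff the released bit is 1, where
    H_1: q(x_i) = 1 ("smokes") and H_0: q(x_i) = 0 ("does not smoke")."""
    tpr = rr_bit_dist(1, eps)[1]  # P(say H_1 | H_1)
    fpr = rr_bit_dist(0, eps)[1]  # P(say H_1 | H_0)
    return tpr, fpr

eps = 1.0
tpr, fpr = tpr_fpr(eps)
print(tpr <= math.exp(eps) * fpr + 1e-12)  # True: TPR <= e^eps * FPR
```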

  12. Randomized Response
  [Warner] Given:
  • a dataset X = { x_1, . . . , x_n } ⊆ 𝒳,
  • a query q : 𝒳 → {0, 1}, e.g. q(x) = 1 if x is a smoker, 0 otherwise.
  Output M(X) = (Y_1(x_1), . . . , Y_n(x_n)), where, independently,
  Y_i(x_i) = q(x_i) with probability e^ε/(1 + e^ε), and Y_i(x_i) = 1 − q(x_i) with probability 1/(1 + e^ε).
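A direct implementation of this mechanism might look as follows (a sketch; `is_smoker` and the toy dataset are my own examples, not from the slides):

```python
import math
import random

def randomized_response(X, q, eps, rng=random):
    """Release each q(x_i) truthfully with probability e^eps/(1+e^eps),
    flipped with probability 1/(1+e^eps), independently per row."""
    p_truth = math.exp(eps) / (1 + math.exp(eps))
    return [q(x) if rng.random() < p_truth else 1 - q(x) for x in X]

# toy usage: q marks smokers
def is_smoker(person):
    return 1 if person == 'smoker' else 0

X = ['smoker', 'non-smoker', 'smoker']
print(randomized_response(X, is_smoker, eps=1.0, rng=random.Random(0)))
```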

  13. Privacy Analysis
  Claim: for any y ∈ {0, 1}^n, and any neighbouring X, X′,
  P(M(X) = y) ≤ e^ε P(M(X′) = y).
  (Summing over y ∈ S gives P(M(X) ∈ S) ≤ e^ε P(M(X′) ∈ S) for any set S.)
  Proof: by independence, P(M(X) = y) = P(Y_1(x_1) = y_1) · · · P(Y_n(x_n) = y_n), and likewise for X′. Say X and X′ differ only in the i-th element. Then all factors except the i-th cancel in the ratio, and
  P(Y_i(x_i) = y_i) / P(Y_i(x_i′) = y_i) ≤ (e^ε/(1 + e^ε)) / (1/(1 + e^ε)) = e^ε.
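The claim can be verified exactly for small cases by computing output probabilities as products over coordinates. A Python sketch (my own check; the datasets and ε are arbitrary choices):

```python
import math
from itertools import product

def output_prob(X, q, eps, y):
    """Exact P(M(X) = y) for randomized response: by independence,
    a product of per-coordinate probabilities."""
    p_truth = math.exp(eps) / (1 + math.exp(eps))
    prob = 1.0
    for x, bit in zip(X, y):
        prob *= p_truth if q(x) == bit else 1 - p_truth
    return prob

eps = 0.5
q = lambda x: x          # data points are already bits
X  = [1, 0, 1]
Xp = [1, 1, 1]           # neighbouring: second element replaced
worst = max(output_prob(X, q, eps, y) / output_prob(Xp, q, eps, y)
            for y in product([0, 1], repeat=3))
print(worst)  # e^0.5: the worst-case ratio comes from the one flipped coordinate
```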

  14. Accuracy Analysis
  Want to approximate q(X) = (1/n) Σ_{i=1}^n q(x_i), e.g. the fraction of smokers.
  Claim: (1/n) Σ_{i=1}^n Z_i ≈ q(X), where Z_i = ((1 + e^ε) Y_i − 1)/(e^ε − 1).
  Indeed, E[Y_i] = q(x_i) e^ε/(1 + e^ε) + (1 − q(x_i))/(1 + e^ε), so E[Z_i] = q(x_i), and the Z_i are independent and take values in [−1/(e^ε − 1), e^ε/(e^ε − 1)].
  By Hoeffding's inequality, P(|(1/n) Σ Z_i − q(X)| ≥ α) ≤ 2 exp(−2nα²(e^ε − 1)²/(e^ε + 1)²), which is at most β once n ≥ ((e^ε + 1)/(e^ε − 1))² log(2/β)/(2α²); for small ε this is n ≳ log(2/β)/(α²ε²).
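The debiasing step can be checked by simulation; below is a Python sketch (my own experiment, with arbitrary choices of n, ε, and a random bit dataset where q is the identity):

```python
import math
import random

def debiased_estimate(Y, eps):
    """Average of Z_i = ((1+e^eps)*Y_i - 1)/(e^eps - 1); each Z_i has
    expectation q(x_i), so the average estimates q(X)."""
    e = math.exp(eps)
    return sum(((1 + e) * y - 1) / (e - 1) for y in Y) / len(Y)

rng = random.Random(1)
eps, n = 1.0, 200_000
X = [rng.randint(0, 1) for _ in range(n)]    # bits, so q is the identity
true_mean = sum(X) / n
p_truth = math.exp(eps) / (1 + math.exp(eps))
Y = [x if rng.random() < p_truth else 1 - x for x in X]
est = debiased_estimate(Y, eps)
print(abs(est - true_mean))  # small, as Hoeffding's inequality predicts
```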
