Differential Privacy: Basics
CompSci 590.03
Instructor: Ashwin Machanavajjhala
Lecture 2 : 590.03 Fall 16

Outline of lecture
• Differential Privacy
• Basic Algorithms
  – Laplace Mechanism
• Composition Theorems
• Exercise

Differential Privacy  [Dwork ICALP 2006]
For every pair of inputs D1, D2 that differ in one row, and for every output O:
the adversary should not be able to distinguish between any D1 and D2 based on any O:

    log( Pr[A(D1) = O] / Pr[A(D2) = O] )  <  ε     (ε > 0)

Why pairs of datasets that differ in one row?
For every pair of inputs that differ in one row, for every output O:
this simulates the presence or absence of a single record.

Why all pairs of datasets?
For every pair of inputs that differ in one row, for every output O:
the guarantee holds no matter what the other records are.

Why all outputs?
A(D1) and A(D2) are compared over the set of all outputs O1, ..., Ok: for each output Oi, consider P[A(D1) = Oi] and P[A(D2) = Oi].

Should not be able to distinguish whether the input was D1 or D2, no matter what the output is.
The definition bounds the worst discrepancy in probabilities over all outputs.

Privacy Parameter ε
For every pair of inputs D1, D2 that differ in one row, and for every output O:

    Pr[A(D1) = O]  ≤  e^ε · Pr[A(D2) = O]

ε controls the degree to which D1 and D2 can be distinguished.
The smaller the ε, the more the privacy (and the worse the utility).

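As a numerical illustration (the values here are chosen only for concreteness): with ε = 0.1, e^ε ≈ 1.105, so for any single output the two probabilities can differ by a factor of at most about 1.1; with ε = 1, e^ε ≈ 2.72, a much weaker guarantee.
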
Outline of Module 2
• Differential Privacy
• Basic Algorithms
  – Laplace Mechanism
• Composition Theorems

Can deterministic algorithms satisfy differential privacy?

Non-trivial deterministic algorithms do not satisfy differential privacy
Consider the space of all inputs and the space of all outputs (with at least 2 distinct outputs).

Non-trivial deterministic algorithms do not satisfy differential privacy
Each input is mapped to a distinct output.

There exist two inputs that differ in one entry that are mapped to different outputs.
For such an output O: Pr[output = O] > 0 under one input and Pr[output = O] = 0 under the other, so the privacy ratio is unbounded.

Random Sampling ...
... also does not satisfy differential privacy.
There is an input D1 and an output O with Pr[D1 → O] > 0, while for a neighboring input D2, Pr[D2 → O] = 0. This implies

    log( Pr[D1 → O] / Pr[D2 → O] )  =  ∞

Output Randomization
The researcher sends a query to the database; the database adds noise to the true answer before returning it.
• Add noise to answers such that:
  – Each answer does not leak too much information about the database.
  – Noisy answers are close to the original answers.

Laplace Mechanism
The researcher sends query q to the database; the database returns the true answer q(D) plus noise: q(D) + η.
The noise η is drawn from the Laplace distribution Lap(λ):

    h(η) ∝ exp( −|η| / λ )

Mean: 0,  Variance: 2λ²
Privacy depends on the λ parameter.

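A minimal sketch (Python with NumPy; the scale value is illustrative) of drawing noise from Lap(λ) and checking its mean and variance:

    import numpy as np

    lam = 2.0                                    # the scale parameter λ
    eta = np.random.laplace(loc=0.0, scale=lam, size=100_000)

    print(eta.mean())                            # close to 0
    print(eta.var())                             # close to 2 * λ² = 8.0
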
How much noise for privacy?  [Dwork et al., TCC 2006]
Sensitivity: Consider a query q: I → R. S(q) is the smallest number s.t. for any neighboring tables D, D':

    | q(D) − q(D') |  ≤  S(q)

Thm: If the sensitivity of the query is S(q), then the Laplace mechanism with

    λ = S(q) / ε

guarantees ε-differential privacy.

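A hedged sketch of the mechanism stated in the theorem; the function name and interface are illustrative:

    import numpy as np

    def laplace_mechanism(true_answer, sensitivity, epsilon):
        """Return q(D) + η with η ~ Lap(S(q)/ε)."""
        lam = sensitivity / epsilon
        return true_answer + np.random.laplace(loc=0.0, scale=lam)
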
Sensitivity: COUNT query
• Number of people having the disease
  (table D, Disease (Y/N) column: Y, Y, N, Y, N, N)
• Sensitivity = 1
• Solution: 3 + η, where η is drawn from Lap(1/ε)
  – Mean = 0
  – Variance = 2/ε²

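The COUNT example as a sketch (the ε value is chosen here only for illustration):

    import numpy as np

    disease = ["Y", "Y", "N", "Y", "N", "N"]     # the Disease (Y/N) column of D
    epsilon = 0.5                                # illustrative privacy budget

    true_count = sum(1 for v in disease if v == "Y")              # = 3
    noisy_count = true_count + np.random.laplace(0.0, 1.0 / epsilon)
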
Sensitivity: SUM query
• Suppose all values x are in [a, b]
• Sensitivity = b

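A sketch of a noisy SUM; note that Sensitivity = b implicitly assumes 0 ≤ a ≤ b, and the clipping step below is an added assumption:

    import numpy as np

    def noisy_sum(values, a, b, epsilon):
        """SUM over values clipped to [a, b]; uses sensitivity b (assumes 0 <= a <= b)."""
        clipped = np.clip(values, a, b)
        return clipped.sum() + np.random.laplace(0.0, b / epsilon)
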
Privacy of Laplace Mechanism
• Consider neighboring databases D and D'
• Consider some output O

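The standard argument compares the two output densities h(η) ∝ exp(−|η|/λ); with λ = S(q)/ε:

    \frac{\Pr[A(D) = O]}{\Pr[A(D') = O]}
      = \frac{\exp(-|O - q(D)|/\lambda)}{\exp(-|O - q(D')|/\lambda)}
      = \exp\!\left(\frac{|O - q(D')| - |O - q(D)|}{\lambda}\right)
      \le \exp\!\left(\frac{|q(D) - q(D')|}{\lambda}\right)
      \le \exp\!\left(\frac{S(q)}{\lambda}\right) = e^{\varepsilon},

using the triangle inequality and the definition of sensitivity.
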
Utility of Laplace Mechanism
• Laplace mechanism works for any function that returns a real number
• Error: E[(true answer − noisy answer)²]
       = Var( Lap(S(q)/ε) )
       = 2 S(q)² / ε²

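Spelling out the error computation (η is the Laplace noise, which has mean 0):

    \mathbb{E}\big[(q(D) - (q(D) + \eta))^2\big]
      = \mathbb{E}[\eta^2]
      = \mathrm{Var}(\eta) + \mathbb{E}[\eta]^2
      = 2\lambda^2
      = \frac{2\,S(q)^2}{\varepsilon^2}.
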
Outline of Module 2
• Differential Privacy
• Basic Algorithms
  – Laplace & Exponential Mechanism
  – Randomized Response
• Composition Theorems

Why Composition?
• Reasoning about the privacy of a complex algorithm is hard.
• Helps software design
  – If building blocks are proven to be private, it is easy to reason about the privacy of a complex algorithm built entirely from these building blocks.

A bound on the number of queries
• In order to ensure utility, a statistical database must leak some information about each individual
• We can only hope to bound the amount of disclosure
• Hence, there is a limit on the number of queries that can be answered

Dinur-Nissim Result  [Dinur-Nissim PODS 2003]
• A vast majority of records in a database of size n can be reconstructed when n·log(n)² queries are answered by a statistical database ...
  ... even if each answer has been arbitrarily altered to have up to o(√n) error.

Sequential Composition
• If M1, M2, ..., Mk are algorithms that access a private database D such that each Mi satisfies εi-differential privacy,
  then the combination of their outputs satisfies ε-differential privacy with ε = ε1 + ... + εk.

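A sketch of how sequential composition is typically applied: split a total budget evenly across k queries (the numbers and helper function are illustrative):

    import numpy as np

    def laplace_mechanism(true_answer, sensitivity, epsilon):
        return true_answer + np.random.laplace(0.0, sensitivity / epsilon)

    total_epsilon = 1.0
    true_answers = [120, 45, 300]          # k = 3 count queries, each with sensitivity 1
    eps_each = total_epsilon / len(true_answers)

    # By sequential composition, releasing all three noisy answers together
    # satisfies (eps_each + eps_each + eps_each) = 1.0-differential privacy.
    noisy = [laplace_mechanism(a, 1.0, eps_each) for a in true_answers]
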
Privacy as Constrained Optimization
• Three axes
  – Privacy
  – Error
  – Queries that can be answered
• E.g.: Given a fixed set of queries and a privacy budget ε, what is the minimum error that can be achieved?

Parallel Composition
• If M1, M2, ..., Mk are algorithms that access disjoint databases D1, D2, ..., Dk such that each Mi satisfies εi-differential privacy,
  then the combination of their outputs satisfies ε-differential privacy with ε = max{ε1, ..., εk}.

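A sketch of parallel composition in action: a histogram partitions the records into disjoint groups, so every noisy count can be released with the full budget (the data and bin boundaries are illustrative):

    import numpy as np

    ages = [23, 37, 41, 52, 29, 64, 35]            # one record per person
    bins = {"<30": 0, "30-49": 0, "50+": 0}        # disjoint partition of the records

    for a in ages:
        key = "<30" if a < 30 else ("30-49" if a < 50 else "50+")
        bins[key] += 1

    epsilon = 0.5
    # Each record affects exactly one bin, so by parallel composition
    # releasing all noisy counts still satisfies 0.5-differential privacy.
    noisy_bins = {k: v + np.random.laplace(0.0, 1.0 / epsilon) for k, v in bins.items()}
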
Postprocessing
• If M1 is an ε-differentially private algorithm that accesses a private database D,
  then outputting M2(M1(D)) also satisfies ε-differential privacy.

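A small illustration (values are illustrative): post-processing the noisy output, here rounding and clamping a noisy count to a non-negative integer, costs no additional privacy budget:

    import numpy as np

    epsilon = 0.5
    noisy_count = 3 + np.random.laplace(0.0, 1.0 / epsilon)   # M1: an ε-DP release
    published = max(0, round(noisy_count))                    # M2: post-processing, still ε-DP
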
Case Study: K-means Clustering

K-means
• Partition a set of points x1, x2, ..., xn into k clusters S1, S2, ..., Sk such that the following is minimized:

    Σ_{i=1..k}  Σ_{x ∈ S_i}  ‖ x − μ_i ‖²

  where μ_i is the mean of the cluster S_i.

K-means
Algorithm:
• Initialize a set of k centers
• Repeat
    Assign each point to its nearest center
    Recompute the set of centers
  Until convergence ...
• Output the final k centers

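A compact non-private sketch of the loop above (NumPy-based; a fixed iteration count stands in for "until convergence"):

    import numpy as np

    def kmeans(points, k, iters=10):
        """Non-private Lloyd's algorithm on an (n, d) array of points."""
        points = np.asarray(points, dtype=float)
        centers = points[np.random.choice(len(points), k, replace=False)]
        for _ in range(iters):
            # Assign each point to its nearest center
            dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute the set of centers (keep the old center if a cluster is empty)
            centers = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
        return centers
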
Exercise
• What is a differentially private algorithm for releasing a k-means clustering (i.e., outputting the final set of k centers)?

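One possible direction, sketched only under stated assumptions (every coordinate clipped to [-bound, bound]; the budget split evenly across iterations by sequential composition, and between the count and sum queries within an iteration): perturb each cluster's count and coordinate sums with Laplace noise before recomputing the centers. This is an illustration of the building blocks from this lecture, not necessarily the intended solution.

    import numpy as np

    def dp_kmeans(points, k, epsilon, iters=5, bound=1.0):
        """Sketch only: assumes points is an (n, d) array with every coordinate
        in [-bound, bound], so counts have sensitivity 1 and each d-dimensional
        sum has L1-sensitivity d * bound."""
        points = np.clip(np.asarray(points, dtype=float), -bound, bound)
        n, d = points.shape
        eps_iter = epsilon / iters                 # sequential composition over iterations
        centers = points[np.random.choice(n, k, replace=False)]
        for _ in range(iters):
            dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            for j in range(k):                     # clusters are disjoint (parallel composition)
                members = points[labels == j]
                # eps_iter/2 for the count (sensitivity 1), eps_iter/2 for the sum
                noisy_count = len(members) + np.random.laplace(0.0, 2.0 / eps_iter)
                noisy_sum = members.sum(axis=0) + np.random.laplace(
                    0.0, 2.0 * d * bound / eps_iter, size=d)
                if noisy_count > 1:
                    centers[j] = noisy_sum / noisy_count
        return centers
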
Summary
• Differentially private algorithms ensure an attacker can't infer the presence or absence of a single record in the input based on any output.
• Building blocks
  – Laplace mechanism
• Composition rules help build complex algorithms using building blocks
