
CSC2412: Exponential Mechanism & Private PAC Learning (Sasho Nikolov)



  1. CSC2412: Exponential Mechanism & Private PAC Learning (Sasho Nikolov)

  2. Classification Basics

  3. The learning problem. Problem: develop an algorithm that classifies avocados into ripe and unripe. We have a big data set of avocado data. For each avocado, we have: features (colour, firmness, size, shape, skin texture, ...) and a label (ripe or not). From this data, we want to classify unseen avocados: learn a rule to predict the label from the features.

  4. The learning problem, formally. Setting (the model):
     • Known data universe X (all possible feature vectors) and an unknown probability distribution D on X.
     • Known concept class C and an unknown concept c ∈ C; each concept is a rule mapping features to a label.
     • We get a dataset X = {(x_1, c(x_1)), ..., (x_n, c(x_n))}, where each x_i is an independent sample from D.
     Goal: learn (an approximation of) c from X. E.g., a concept might classify an avocado as ripe iff its colour and firmness take particular values.

  5. The goal, formally. The error of a concept c′ ∈ C is the fraction of the population it labels incorrectly:
     L_{D,c}(c′) = P_{x∼D}(c′(x) ≠ c(x)),
     the loss of c′ w.r.t. D and c. On input X = {(x_1, c(x_1)), ..., (x_n, c(x_n))}, with x_1, ..., x_n sampled i.i.d. from D, we want an algorithm M that outputs some c′ ∈ C and satisfies
     P(L_{D,c}(M(X)) ≤ α) ≥ 1 − β,
     i.e., the output of M misclassifies at most an α fraction of the population, except with probability β over the randomness in choosing X and any randomness of M. This is Probably Approximately Correct (PAC) learning [Valiant].

  6. Empirical risk minimization. Issue: we want to find argmin_{c′ ∈ C} L_{D,c}(c′), but we do not know D or c. Solution: instead solve argmin_{c′ ∈ C} L_X(c′), an approximate minimizer, where
     L_X(c′) = |{i : c′(x_i) ≠ c(x_i)}| / n
     is the empirical loss: the fraction of points in X (which is known) misclassified by c′.
     Theorem (Uniform convergence). Suppose that n ≥ ln(|C|/β) / (2α²). Then, with probability ≥ 1 − β,
     max_{c′ ∈ C} |L_X(c′) − L_{D,c}(c′)| ≤ α,
     i.e., population and empirical loss are close for every c′ ∈ C. Other versions exist for infinite C, e.g., via VC dimension.
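
The ERM step can be made concrete with a small sketch. The threshold concepts, the toy universe {0, ..., 9}, and all parameter values below are made up for illustration; only `empirical_error` and `erm` correspond to the notions on the slide.

```python
import random

def empirical_error(concept, data):
    """L_X(c'): fraction of labeled points (x, y) in data that concept misclassifies."""
    return sum(1 for x, y in data if concept(x) != y) / len(data)

def erm(concepts, data):
    """Empirical risk minimization over a finite concept class."""
    return min(concepts, key=lambda c: empirical_error(c, data))

# Toy universe X = {0, ..., 9}; concept class = threshold rules c_t(x) = 1[x >= t].
concepts = [lambda x, t=t: int(x >= t) for t in range(11)]
true_c = concepts[5]  # the unknown concept the learner must recover

random.seed(0)
data = [(x, true_c(x)) for x in (random.randrange(10) for _ in range(200))]

best = erm(concepts, data)
print(empirical_error(best, data))  # 0.0: the true concept is in the class
```

Since the labels are generated by a concept in the class, the minimum empirical error is exactly 0 (the realizable setting).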

  7. Private learning. In private PAC learning, we require that:
     • when X is a sample of i.i.d. labeled data points, we (approximately) learn the correct concept, as in standard PAC learning;
     • the learning algorithm is ε-differentially private for any labeled data set X ∈ (X × {0, 1})^n: for all neighbouring X, X′ and all sets S of outputs, P(M(X) ∈ S) ≤ e^ε P(M(X′) ∈ S). Privacy must hold even if the data is not i.i.d.

  8. We want to do ERM with ε-DP, i.e., (approximately) minimize
     L_X(c′) = |{i : c′(x_i) ≠ c(x_i)}| / n over c′ ∈ C.
     How can we use the Laplace mechanism for this? Each L_X(c′) is a counting query (exercise: analyze this), so we could release noisy answers to all the counting queries {L_X(c_1), ..., L_X(c_|C|)}, where C = {c_1, ..., c_|C|}.
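
One concrete reading of this exercise, as a sketch: release every L_X(c′) with Laplace noise. Each L_X(c′) is a counting query whose value changes by at most 1/n between neighbouring datasets, so by basic composition, per-query noise of scale |C|/(εn) makes the whole release ε-DP. The toy threshold class and the parameter values below are illustrative.

```python
import math
import random

def laplace_noise(scale, rng=random):
    # Inverse-CDF sample from Laplace(0, scale), standard library only.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def noisy_empirical_errors(concepts, data, eps, rng=random):
    """eps-DP release of all empirical errors {L_X(c_1), ..., L_X(c_|C|)}.

    Each L_X(c') has sensitivity 1/n; basic composition over |C| queries
    needs per-query Laplace scale |C| / (eps * n)."""
    n = len(data)
    scale = len(concepts) / (eps * n)
    errors = [sum(1 for x, y in data if c(x) != y) / n for c in concepts]
    return [e + laplace_noise(scale, rng) for e in errors]

random.seed(0)
concepts = [lambda x, t=t: int(x >= t) for t in range(11)]  # toy threshold class
true_c = concepts[5]
data = [(x, true_c(x)) for x in (random.randrange(10) for _ in range(1000))]
released = noisy_empirical_errors(concepts, data, eps=1.0)
```

Taking the argmin over the released values gives a private ERM, but the noise scale grows linearly in |C|; avoiding that dependence is exactly what the exponential mechanism in the next section achieves, with error growing only like ln |C|.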

  9. Exponential mechanism

  10. Private ERM. We want to solve argmin_{c′ ∈ C} L_X(c′). How do we minimize with differential privacy? Sample concepts with less error with higher probability:
      P(M(X) = c′) ∝ exp(−εn L_X(c′) / 2).

  11. Exponential Mechanism. General set-up: an output space Y and a score function u assigning a real value u(X, y) that measures how good the output y ∈ Y is for the dataset X. Goal: given X, find argmax_{y ∈ Y} u(X, y). The sensitivity of u is
      Δu = max_{y ∈ Y} max_{X ∼ X′} |u(X, y) − u(X′, y)|,
      i.e., how different the score can be between neighbouring datasets X, X′. The mechanism M_exp(X) outputs a random Y so that
      P(Y = y) = exp(ε u(X, y) / (2Δu)) / Σ_{z ∈ Y} exp(ε u(X, z) / (2Δu)),
      where the sum in the denominator is the normalizing factor, and is ε-differentially private.
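
The sampling rule can be implemented directly over a finite output space. Subtracting the maximum score before exponentiating leaves the distribution unchanged (the shift cancels in the normalization) and avoids overflow; the scores and parameters below are made up for illustration.

```python
import math
import random

def exponential_mechanism(scores, eps, sensitivity, rng=random):
    """Sample an index y with P(y) proportional to exp(eps * scores[y] / (2 * sensitivity))."""
    m = max(scores)  # shift by the max: same distribution, no overflow
    weights = [math.exp(eps * (s - m) / (2 * sensitivity)) for s in scores]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for y, w in enumerate(weights):
        acc += w
        if r < acc:
            return y
    return len(scores) - 1  # guard against floating-point rounding

random.seed(0)
# With a large eps, the highest-scoring output dominates the distribution.
samples = [exponential_mechanism([0.0, 0.5, 1.0], eps=50, sensitivity=1.0)
           for _ in range(100)]
print(samples.count(2))
```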

  12. Privacy analysis. It is enough to show that for all neighbouring X, X′ and all y ∈ Y,
      P(M_exp(X) = y) / P(M_exp(X′) = y) ≤ e^ε.
      Write Z(X) = Σ_{z ∈ Y} exp(ε u(X, z) / (2Δu)) for the normalizing factor. Then
      P(M_exp(X) = y) / P(M_exp(X′) = y) = exp(ε (u(X, y) − u(X′, y)) / (2Δu)) · Z(X′) / Z(X) ≤ e^{ε/2} · Z(X′) / Z(X) ≤ e^{ε/2} · e^{ε/2} = e^ε,
      where both inequalities use |u(X, z) − u(X′, z)| ≤ Δu for every z: the first bounds the ratio of the two weights at y, and the second bounds Z(X′) term by term against Z(X).
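
The two e^{ε/2} factors in this argument can also be checked numerically: if two score vectors differ by at most Δu in every coordinate, no output probability changes by more than a factor e^ε. The score vectors below are made up (with Δu = 1):

```python
import math

def exp_mech_probs(scores, eps, sensitivity):
    """Output distribution of the exponential mechanism over a finite Y."""
    weights = [math.exp(eps * s / (2 * sensitivity)) for s in scores]
    z = sum(weights)  # the normalizing factor Z(X)
    return [w / z for w in weights]

eps = 1.0
u_x  = [3.0, 1.0, 0.0, 2.0]   # scores u(X, y) on a dataset X
u_x2 = [2.5, 2.0, 0.5, 1.5]   # scores on a neighbour X': each moved by <= 1
p = exp_mech_probs(u_x, eps, 1.0)
q = exp_mech_probs(u_x2, eps, 1.0)
max_ratio = max(max(a / b, b / a) for a, b in zip(p, q))
print(max_ratio, "<=", math.exp(eps))
```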

  13. Accuracy of the exponential mechanism. Let OPT(X) = max_{y ∈ Y} u(X, y). Then, for the output Y = M_exp(X),
      P(u(X, Y) ≤ OPT(X) − t) ≤ |Y| · e^{−εt/(2Δu)}.
      Proof sketch: any fixed y with u(X, y) ≤ OPT(X) − t has probability
      exp(ε u(X, y)/(2Δu)) / Z(X) ≤ exp(ε (OPT(X) − t)/(2Δu)) / exp(ε OPT(X)/(2Δu)) = e^{−εt/(2Δu)},
      since the maximizer alone contributes exp(ε OPT(X)/(2Δu)) to Z(X); a union bound over at most |Y| such outputs y gives the claim. So the output achieves u(X, Y) > OPT(X) − t with probability ≥ 1 − |Y| e^{−εt/(2Δu)}.
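
Rearranging the bound is how the sample-size requirements later arise: to push the failure probability below β it suffices to take t = (2Δu/ε) ln(|Y|/β), so the score gap grows only logarithmically in |Y|. A quick arithmetic check with illustrative parameter values:

```python
import math

def failure_bound(eps, sensitivity, num_outputs, t):
    """The bound |Y| * exp(-eps * t / (2 * sensitivity)) on P(u(X, Y) <= OPT(X) - t)."""
    return num_outputs * math.exp(-eps * t / (2 * sensitivity))

def utility_gap(eps, sensitivity, num_outputs, beta):
    """Smallest t making the bound equal to beta: t = (2 * sensitivity / eps) * ln(|Y| / beta)."""
    return (2 * sensitivity / eps) * math.log(num_outputs / beta)

t = utility_gap(eps=1.0, sensitivity=1.0, num_outputs=100, beta=0.05)
print(failure_bound(1.0, 1.0, 100, t))  # equals beta = 0.05 up to floating point
```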

  14. Private Learning

  15. Recap. Known data universe X with an unknown distribution D on it; known concept class C with an unknown concept c ∈ C. Data set X = {(x_1, c(x_1)), ..., (x_n, c(x_n))}, with x_1, ..., x_n i.i.d. from D. Uniform convergence: if n ≥ ln(|C|/β)/(2α²), then with probability ≥ 1 − β, |L_X(c′) − L_{D,c}(c′)| ≤ α for all c′ ∈ C. Exponential mechanism: sample y ∈ Y with probability proportional to exp(ε u(X, y)/(2Δu)), where Δu = max_{y ∈ Y} max_{X ∼ X′} |u(X, y) − u(X′, y)|.

  16. Putting things together. Theorem. A concept class C can be learned by an ε-differentially private mechanism when the sample size satisfies
      n ≥ max{ 4 ln(2|C|/β) / α², 4 ln(2|C|/β) / (εα) }.
      Use the exponential mechanism M_exp with score u(X, c′) = −L_X(c′); M_exp(X) is ε-differentially private by the privacy analysis. By uniform convergence (applied with α/2 and β/2), with probability ≥ 1 − β/2, |L_X(c′) − L_{D,c}(c′)| ≤ α/2 for every c′ ∈ C.
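
The resulting learner fits in a few lines: run the exponential mechanism over C with score −L_X(c′) and Δu = 1/n. A self-contained sketch on a toy threshold class (the class, universe, and parameter values are illustrative, not from the lecture):

```python
import math
import random

def private_erm(concepts, data, eps, rng=random):
    """eps-DP learner: exponential mechanism with u(X, c') = -L_X(c'), sensitivity 1/n.

    Sampling weights are exp(-eps * n * L_X(c') / 2); shifting by the minimum
    error cancels in the normalization and keeps the exponents bounded."""
    n = len(data)
    errors = [sum(1 for x, y in data if c(x) != y) / n for c in concepts]
    m = min(errors)
    weights = [math.exp(-eps * n * (e - m) / 2) for e in errors]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for c, w in zip(concepts, weights):
        acc += w
        if r < acc:
            return c
    return concepts[-1]  # guard against floating-point rounding

random.seed(0)
concepts = [lambda x, t=t: int(x >= t) for t in range(11)]  # toy threshold class
true_c = concepts[5]
data = [(x, true_c(x)) for x in (random.randrange(10) for _ in range(500))]
learned = private_erm(concepts, data, eps=1.0)
print(sum(1 for x, y in data if learned(x) != y) / len(data))
```

With n = 500 and ε = 1, a concept whose empirical error exceeds the minimum by δ gets weight e^{−250δ}, so the sampled concept has near-minimal empirical error with overwhelming probability, matching the |C| e^{−εnt/2} tail bound.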

  17. Putting things together, continued. M(X) samples c′ ∈ C with probability ∝ exp(−εn L_X(c′)/2): the exponential mechanism with u(X, c′) = −L_X(c′) and Δu = 1/n. Here
      OPT(X) = max_{c′ ∈ C} (−L_X(c′)) = −min_{c′ ∈ C} L_X(c′) = 0,
      since the true concept c ∈ C has L_X(c) = 0. By the accuracy bound, for t = (2/(εn)) ln(2|C|/β),
      P(L_X(M(X)) ≥ t) = P(u(X, M(X)) ≤ OPT(X) − t) ≤ |C| e^{−εnt/2} = β/2,
      and n ≥ 4 ln(2|C|/β)/(εα) gives t ≤ α/2. Combining with uniform convergence via a union bound, with probability ≥ 1 − β,
      L_{D,c}(M(X)) ≤ L_X(M(X)) + α/2 ≤ α.
