Security and Privacy of Machine Learning


  1. Security and Privacy of Machine Learning. Ian Goodfellow, Staff Research Scientist, Google Brain, @goodfellow_ian

  2. Machine Learning and Security: two directions, machine learning for security and security against machine learning (examples: password guessing, malware detection, fake reviews, intrusion detection). [Neural network diagrams omitted.]

  3. Security of Machine Learning. [Neural network diagram omitted.]

  4. An overview of a field. This presentation summarizes the work of many people, not just my own or my collaborators'. Download the slides for links to extensive references. The presentation focuses on the concepts, not the history or the inventors.

  5. Machine Learning Pipeline: training data X and a learning algorithm produce learned parameters θ; at test time, a test input x is mapped through θ to a test output ŷ.
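
Not from the slide, but a minimal sketch of this pipeline for concreteness; scikit-learn's LogisticRegression stands in for the generic learning algorithm and the dataset is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Training data X (features) and labels y
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# The learning algorithm consumes (X, y) and produces learned parameters theta
model = LogisticRegression(max_iter=1000).fit(X, y)
theta = (model.coef_, model.intercept_)  # the learned parameters

# At test time, a new input x is mapped through theta to an output y_hat
x_test = X[:1]
y_hat = model.predict(x_test)
print(y_hat)
```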

  6. Privacy of Training Data: the learned parameters θ̂ may leak information about the training data X. [Pipeline diagram omitted.]

  7. Defining (ε, δ)-Differential Privacy (Abadi 2017).
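
The slide body is a figure; for reference, the standard definition it names is: a randomized mechanism M is (ε, δ)-differentially private if, for every pair of datasets D and D′ differing in a single record and every set S of possible outputs,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```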

  8. Private Aggregation of Teacher Ensembles (PATE) (Papernot et al., 2016).
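
A minimal sketch of PATE's central aggregation step, under assumptions not spelled out on the slide: the teacher predictions are simulated here, the noise scale gamma is a placeholder, and the full method goes on to train a student model on public data labeled this way.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_aggregate(teacher_preds, num_classes, gamma=0.05):
    """PATE-style noisy max: count teacher votes, add Laplace noise, take the argmax."""
    votes = np.bincount(teacher_preds, minlength=num_classes)
    noisy_votes = votes + rng.laplace(scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))

# Toy example: 250 teachers, each trained on a disjoint shard of the private
# data, vote on the label of one unlabeled public example.
num_classes = 10
teacher_preds = rng.integers(0, num_classes, size=250)
label_for_student = noisy_aggregate(teacher_preds, num_classes)
print(label_for_student)
```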

  9. Training Set Poisoning: the attacker tampers with the training data X to corrupt the learned parameters θ̂. [Pipeline diagram omitted.]

  10. ImageNet Poisoning (Koh and Liang, 2017).

  11. Adversarial Examples: at test time, the attacker perturbs the test input x to control the test output ŷ. [Pipeline diagram omitted.]

  12. Model Theft: the attacker queries the deployed model with inputs x, observes the outputs ŷ, and trains their own copy θ̂ of the model. [Pipeline diagram omitted.]

  13. Model Theft++: the stolen copy θ̂ can then be used to mount further attacks, for example against the privacy of the training data X. [Pipeline diagram omitted.]

  14. Deep Dive on Adversarial Examples. Since 2013, deep neural networks have matched human performance at recognizing objects and faces (Szegedy et al., 2014; Taigman et al., 2013), solving CAPTCHAs and reading addresses (Goodfellow et al., 2013), and other tasks.

  15. Adversarial Examples.

  16. Turning objects into airplanes.

  17. Attacking a linear model.
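
For intuition (not from the slide itself), a minimal sketch of why linear models are easy to attack: perturbing each input dimension by ε in the direction of the sign of the corresponding weight shifts the score by ε times the L1 norm of the weights, which grows with the input dimension. The weights and input here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3072                      # e.g., a flattened 32x32x3 image
w = rng.normal(size=dim)        # weights of a hypothetical linear classifier
x = rng.normal(size=dim)        # an input the classifier scores

eps = 0.01                      # small per-dimension perturbation budget
x_adv = x + eps * np.sign(w)    # fast gradient sign perturbation for the score w.x

print("clean score:      ", w @ x)
print("adversarial score:", w @ x_adv)             # shifted by eps * ||w||_1
print("score shift:      ", eps * np.abs(w).sum())
```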

  18. Wrong almost everywhere.

  19. Cross-model, cross-dataset transfer.

  20. Transfer across learning algorithms (Papernot et al., 2016).

  21. Transfer attack. Target model with unknown weights, machine learning algorithm, and training set; maybe non-differentiable. Train your own substitute model mimicking the target, with a known, differentiable function. Craft adversarial examples against the substitute, then deploy them against the target; the transferability property results in them succeeding.
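
A minimal sketch of that loop, assuming only black-box label access to the target; the target, substitute architecture, query distribution, and ε are all placeholder assumptions rather than anything prescribed by the slide.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes, dim = 10, 64

# Stand-in for the black-box target: in a real attack we only see its labels.
target = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, num_classes))

def query_target(x):
    with torch.no_grad():
        return target(x).argmax(dim=1)       # labels only, no gradients

# 1) Train a known, differentiable substitute on inputs labeled by the target.
substitute = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, num_classes))
opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):
    x = torch.randn(128, dim)                # attacker-chosen queries
    y = query_target(x)                      # labels from the black box
    opt.zero_grad()
    loss_fn(substitute(x), y).backward()
    opt.step()

# 2) Craft adversarial examples against the substitute (gradient sign step),
# 3) then deploy them against the target and rely on transferability.
x = torch.randn(32, dim, requires_grad=True)
y = query_target(x.detach())
loss_fn(substitute(x), y).backward()
x_adv = (x + 0.25 * x.grad.sign()).detach()

clean_acc = (query_target(x.detach()) == y).float().mean().item()
adv_acc = (query_target(x_adv) == y).float().mean().item()
print(f"target accuracy on clean inputs: {clean_acc:.2f}, on adversarial inputs: {adv_acc:.2f}")
```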

  22. Enhancing Transfer with Ensembles (Liu et al., 2016).

  23. Transfer to the Human Brain (Elsayed et al., 2018).

  24. Transfer to the Physical World (Kurakin et al., 2016).

  25. Adversarial Training. [Plot: test misclassification rate (log scale, roughly 10^0 down to 10^-2) versus training time in epochs (0 to 300), for the four combinations of Train={Clean, Adv} and Test={Clean, Adv}.]
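
A minimal sketch of gradient-sign-based adversarial training, i.e. mixing adversarially perturbed examples into every training step; the model, the random placeholder data, and ε are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, num_classes, eps = 32, 5, 0.1
model = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, num_classes))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y):
    """Perturb x in the direction that increases the model's loss."""
    x = x.clone().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

for step in range(100):
    x = torch.randn(64, dim)                     # placeholder batch
    y = torch.randint(0, num_classes, (64,))
    x_adv = fgsm(x, y)
    opt.zero_grad()
    # Train on both the clean and the adversarial version of the batch.
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    opt.step()
```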

  26. Adversarial Training vs Certified Defenses. Adversarial training: train on adversarial examples; this minimizes a lower bound on the true worst-case error and achieves a high amount of (empirically tested) robustness on small to medium datasets. Certified defenses: minimize an upper bound on the true worst-case error; robustness is guaranteed, but the amount of robustness is small, and verification of models that weren't trained to be easy to verify is hard.

  27. Limitations of defenses. Even certified defenses so far assume an unrealistic threat model: typically, the attacker can change the input only within some norm ball. Real attacks will be stranger and hard to characterize ahead of time (Brown et al., 2017).

  28. Clever Hans (“Clever Hans, Clever Algorithms,” Bob Sturm).

  29. Get involved! https://github.com/tensorflow/cleverhans

  30. Apply What You Have Learned. Publishing an ML model or a prediction API? If the training data is sensitive, train with differential privacy. Consider how an attacker could cause damage by fooling your model: current defenses are not practical, so rely on situations with no incentive to cause harm or a limited amount of potential harm.
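
As a concrete illustration of "train with differential privacy", a minimal sketch of one DP-SGD step in the style of Abadi et al. (2016): clip each per-example gradient, add Gaussian noise, and average. The model, clipping norm, noise multiplier, and data are placeholder assumptions; a real deployment would use a maintained library and a proper privacy accountant to track (ε, δ).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, clip_norm, noise_mult, lr = 20, 1.0, 1.1, 0.05
model = nn.Linear(dim, 2)
loss_fn = nn.CrossEntropyLoss()

def dp_sgd_step(x_batch, y_batch):
    """One DP-SGD step: per-example gradient clipping plus Gaussian noise."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(x_batch, y_batch):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-12), max=1.0)
        for s, g in zip(summed, grads):
            s += g * scale
    n = len(x_batch)
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noise = torch.randn_like(s) * noise_mult * clip_norm
            p -= lr * (s + noise) / n

# Placeholder data for one step.
x_batch = torch.randn(32, dim)
y_batch = torch.randint(0, 2, (32,))
dp_sgd_step(x_batch, y_batch)
```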
