
Lecture 21: Adversarial Networks (CS109B Data Science 2, Pavlos Protopapas and Mark Glickman)



  1. Lecture 21: Adversarial Networks. CS109B Data Science 2, Pavlos Protopapas and Mark Glickman.

  2. How vulnerable are Neural Networks? Uses of Neural Networks.

  3. How vulnerable are Neural Networks?

  4. Explaining Adversarial Examples [Goodfellow et al. '15]: 1. Robust attacks with FGSM. 2. Robust defense with Adversarial Training.

  5. Explaining Adversarial Examples.

  6. Some of these adversarial examples can even fool humans.

  7. Attacking with the Fast Gradient Sign Method (FGSM): x* = x + λ · sign(∇_x L), where L is the loss of the network (weights W) on input x.

  8. Attacking with the Fast Gradient Sign Method (FGSM): x* = x + λ · sign(∇_x L).

  9. x* = x + λ · sign(∇_x L).
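A minimal FGSM sketch in TensorFlow, assuming a Keras classifier model that outputs class probabilities and integer labels y_true; the function name and the sparse cross-entropy loss are illustrative choices, not from the slides:

    import tensorflow as tf

    def fgsm_attack(model, x, y_true, lam=0.01):
        """Fast Gradient Sign Method: x* = x + lam * sign(grad_x L)."""
        x = tf.convert_to_tensor(x)
        with tf.GradientTape() as tape:
            tape.watch(x)  # differentiate w.r.t. the input, not the weights
            loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, model(x))
        grad = tape.gradient(loss, x)             # the gradient of L w.r.t. x
        x_adv = x + lam * tf.sign(grad)           # step in the direction of its sign
        return tf.clip_by_value(x_adv, 0.0, 1.0)  # keep pixels in a valid range

λ (here lam) is kept small so the perturbation is imperceptible to a human while still flipping the network's prediction.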

  10. Defending with Adversarial Training: 1. Generate adversarial examples. 2. Adjust labels.

  11. Defending with Adversarial Training (“Panda”): 1. Generate adversarial examples. 2. Adjust labels.

  12. Defending with Adversarial Training (“Panda”): 1. Generate adversarial examples. 2. Adjust labels. 3. Add them to the training set. 4. Train a new network.
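A sketch of the adversarial-training loop on this slide, assuming the fgsm_attack helper sketched above, a compiled Keras model, and a dataset that yields (images, labels) batches; the adversarial copies keep their original (“Panda”) labels:

    # One pass of adversarial training: for every clean batch, generate FGSM
    # examples, keep the true labels, and train on clean + adversarial data.
    for x_batch, y_batch in dataset:
        x_adv = fgsm_attack(model, x_batch, y_batch, lam=0.01)  # 1. generate adversarial examples
        y_adv = y_batch                                          # 2. labels stay the true class
        x_all = tf.concat([x_batch, x_adv], axis=0)              # 3. add them to the training set
        y_all = tf.concat([y_batch, y_adv], axis=0)
        model.train_on_batch(x_all, y_all)                       # 4. train the new network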

  13. Attack methods post Goodfellow 2015: ● FGSM [Goodfellow et al. '15] ● JSMA [Papernot et al. '16] ● C&W [Carlini + Wagner '16] ● Step-LL [Kurakin et al. '17] ● I-FGSM [Tramer et al. '18].

  14. White box attacks (full access to the network weights W and loss L): x* = x + λ · sign(∇_x L), or without the sign, x* = x + λ · ∇_x L.

  15. “Black Box” Attacks [Papernot et al. '17].

  16. “Black Box” Attacks: examine inputs and outputs of the model.

  17. “Black Box” Attacks: Panda.

  18. “Black Box” Attacks: Panda, Gibbon.

  19. “Black Box” Attacks: Panda, Gibbon, Ostrich.

  20. “Black Box” Attacks: train a model that performs the same as the black box.

  21. “Black Box” Attacks: train a model that performs the same as the black box (Panda, Gibbon, Ostrich).

  22. “Black Box” Attacks: now attack the model you just trained with a “white box” attack: x* = x + λ · ∇_x L, or x* = x + λ · sign(∇_x L).

  23. “Black Box” Attacks: use those adversarial examples against the “black box”.
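A hedged sketch of the substitute-model pipeline from slides 15-23, in the spirit of Papernot et al. '17; query_black_box and make_substitute are hypothetical placeholders for the victim's prediction API and whatever local architecture you choose, and fgsm_attack is the helper sketched earlier:

    import numpy as np

    def black_box_attack(query_black_box, make_substitute, x_pool, lam=0.01):
        # 1. Examine inputs and outputs: label our own images with the victim's predictions.
        y_pool = np.argmax(query_black_box(x_pool), axis=-1)
        # 2. Train a model that performs the same as the black box.
        substitute = make_substitute()
        substitute.fit(x_pool, y_pool, epochs=5)
        # 3. Attack the model we just trained with a white-box attack (FGSM).
        x_adv = fgsm_attack(substitute, x_pool, y_pool, lam)
        # 4. Use those adversarial examples against the black box.
        return x_adv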

  24. CleverHans: a Python library to benchmark machine learning systems' vulnerability to adversarial examples. https://github.com/tensorflow/cleverhans http://www.cleverhans.io/
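For reference, one possible CleverHans call for FGSM; the import path and argument names below follow the library's TF2 attacks module and are an assumption that may not match your installed release, so check the repository before relying on them:

    # Assumed CleverHans TF2 API; verify the path and signature for your version.
    import numpy as np
    from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

    x_adv = fast_gradient_method(model, x_batch, eps=0.01, norm=np.inf)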

  25. More Defenses. Mixup (smooth decision boundaries): • Mix two training examples • Regularize the derivatives w.r.t. x • Augment the training set: x̃ = λ·x_i + (1 − λ)·x_j, ỹ = λ·y_i + (1 − λ)·y_j.
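A minimal mixup sketch in NumPy, assuming one-hot labels; alpha is the usual Beta-distribution parameter and its value here is an illustrative default, not from the slides:

    import numpy as np

    def mixup_batch(x, y_onehot, alpha=0.2):
        """Mixup: x~ = lam*x_i + (1-lam)*x_j, y~ = lam*y_i + (1-lam)*y_j."""
        lam = np.random.beta(alpha, alpha)     # mixing coefficient lambda
        idx = np.random.permutation(len(x))    # pair each example with a shuffled partner
        x_mix = lam * x + (1.0 - lam) * x[idx]
        y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[idx]
        return x_mix, y_mix

Training on (x_mix, y_mix) instead of the raw batch encourages roughly linear behaviour between training examples, which is the boundary-smoothing effect the slide refers to.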

  26. Physical attacks: • Object Detection • Adversarial Stickers.

  27. Thank you.
