Distillation as a Defense to Adversarial Perturbations against Deep - PowerPoint PPT Presentation

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami
May 24th, 2016 @ 37th IEEE Symposium on Security and Privacy
@NicolasPapernot

[Figure: deep neural network with an input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), and an output layer (N components). Neurons are joined by weighted links; each weight is a parameter, part of θ_O. Output probabilities: p0=0.01, p1=0.93, …, p8=0.02, …, pN=0.01.]

[Figure: deep neural network with an input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), and an output layer (N components). Neurons are joined by weighted links; each weight is a parameter, part of θ_O. Output probabilities: p0=0.01, p1=0.02, …, p8=0.89, …, pN=0.01.]

Deep Learning for Classification

[Figure: deep neural network with an input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), and an output layer (N components). Neurons are joined by weighted links; each weight is a parameter, part of θ_O. Shown first without outputs, then with output probabilities p0=0.01, p1=0.93, …, p8=0.02, …, pN=0.01.]

[Figure: pipeline diagram with components Feature Extraction, Acoustic Model, Decision Trees, Lexicon, Language Model, and NLP, and levels Audio, Frame, State, Phoneme, Word, Sentence, and Meaning. Source: Tara N. Sainath, Google @ ICML DL Workshop 2015]

Adversarial Samples

[Figure: CIFAR10 Dataset. Axes: input class (0 to 9) against output classification (0 to 9). Visible labels: truck, automobile, bird, airplane, bird.]

Adversarial strategy

Defending against Adversarial Perturbations

DNN Robustness

Defense Design
• Low impact on the architecture
• Maintain accuracy
• Robust in space relatively close to the legitimate distribution
• Maintain speed of network

Softmax Layer and Probabilities

Defensive Distillation

Defensive Distillation
Set temperature T=1 for predictions
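The slide text only says that the temperature is set to T=1 for predictions. As a minimal sketch of what that means, assuming the usual temperature-scaled softmax exp(z_i/T) / Σ_l exp(z_l/T) over the logits z, the following compares the output at a high temperature with the output at T=1. The function name, the logits, and the value T=20 are illustrative assumptions and do not come from the slides.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis of `logits`, computed at the given temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max(axis=-1, keepdims=True)  # shift for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits for one 4-class input (illustrative values only).
logits = np.array([4.0, 1.0, 0.5, -2.0])

# High temperature (assumed T=20): a smooth distribution, usable as soft labels.
soft_labels = softmax_with_temperature(logits, temperature=20.0)

# T=1, as the slide prescribes for predictions: a sharper distribution.
predictions = softmax_with_temperature(logits, temperature=1.0)

print("T=20:", np.round(soft_labels, 3))
print("T=1: ", np.round(predictions, 3))
```

Both outputs rank the classes identically; the temperature changes only how peaked the distribution is.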

Intuition behind Defensive Distillation
• Constraining Training
• Reducing Jacobian Amplitudes
[Equation annotations: "0 if i not correct class"; "never equal to 0"]
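The equations behind "Reducing Jacobian Amplitudes" did not survive extraction. As a rough sketch of the idea, assuming a temperature-scaled softmax F = softmax(z/T), the derivative of the outputs with respect to the logits is (1/T)(diag(F) − F Fᵀ), so it carries an explicit 1/T factor. The code below evaluates that expression for fixed, hypothetical logits at a few assumed temperatures. It covers only the softmax factor: it does not model training, and for large gaps between logits the trend at low temperatures can differ.

```python
import numpy as np


def softmax(z, temperature):
    """Temperature-scaled softmax of a 1-D logit vector."""
    s = np.asarray(z, dtype=float) / temperature
    e = np.exp(s - s.max())  # shift for numerical stability
    return e / e.sum()


def softmax_jacobian(z, temperature):
    """Jacobian dF_i/dz_j of the temperature-T softmax with respect to the logits."""
    p = softmax(z, temperature)
    return (np.diag(p) - np.outer(p, p)) / temperature


# Hypothetical, deliberately small logits (illustrative values only).
z = np.array([1.0, 0.5, -0.5])

# Mean absolute Jacobian entry at each assumed temperature.
amplitudes = {T: np.abs(softmax_jacobian(z, T)).mean() for T in (1, 10, 100)}

for T, amp in amplitudes.items():
    print(f"T={T:>3}: mean |dF/dz| = {amp:.5f}")
```

For these logits the mean amplitude shrinks as the temperature grows, which is the qualitative behaviour the slide title refers to.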

Validation

Experimental Setup

[Figure: Adversarial Sample Success Rate (axis ticks 0 to 100) versus Distillation Temperature (axis ticks 1, 10, 100). Series: Adversarial Samples Success Rate (MNIST), Adversarial Samples Baseline Rate (MNIST), Adversarial Samples Success Rate (CIFAR10), Adversarial Samples Baseline Rate (CIFAR10).]

Impact on accuracy

Impact on Jacobian Amplitude

Estimation of Robustness

Conclusions

Takeaways
• Distillation significantly reduces attack success
• Yields model smoothness
• Easy implementation, low overhead
• Acceptable impact on accuracy

Questions? https://www.papernot.fr nicolas@papernot.fr @NicolasPapernot
