Transferable Adversarial Examples: Insights, Attacks & Defenses
June 12th, 2017
Florian Tramèr
Joint work with Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh & Patrick McDaniel
Adversarial Examples Threat Model: White-Box Attacks
[Figure: a "bird" image is fed to an ML model; the gradient of the loss against the ground-truth label is used to craft a perturbation that makes the model predict "plane" instead of "bird".]
Take the gradient of the loss: the "Fast Gradient Sign Method" (FGSM):
r = ε · sign(∇_x J(x, y, θ))
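To make the attack concrete, here is a minimal FGSM sketch on a toy logistic-regression model. The model, its parameters, and the input are made up for illustration; only the perturbation rule r = ε · sign(∇_x J(x, y, θ)) comes from the slide.

```python
import numpy as np

def fgsm_perturbation(grad_x, eps):
    """FGSM perturbation: r = eps * sign(grad_x J(x, y, theta))."""
    return eps * np.sign(grad_x)

# Toy setting: binary logistic regression with made-up "trained" parameters.
rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.1           # hypothetical model parameters theta
x, y = rng.normal(size=5), 1             # clean input and its true label

p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted probability of class 1
grad_x = (p - y) * w                     # gradient of the logistic loss w.r.t. x

x_adv = x + fgsm_perturbation(grad_x, eps=0.25)   # adversarial example
```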
Adversarial Examples Threat Model: White-Box Attacks
[Figure: hypothetical attacks on autonomous vehicles. Denial of service: a confusing object. Harm self / passengers: an adversarial input recognized as "navigable road". Harm others: an adversarial input recognized as "open space on the road".]
Perturbation constraint: ‖r‖∞ = ε
"Fast Gradient Sign Method" (FGSM): r = ε · sign(∇_x J(x, y, θ))
Adversarial Examples Threat Model: Black-Box Attacks
[Figure: an adversarial example crafted on one ML model is also classified as "plane" by several other ML models.]
Adversarial examples transfer between models.
The Space of Transferable Adversarial Examples
How large is the "space" of adversarial examples?
• At least 2-dimensional
– Warde-Farley & Goodfellow 2016
– Liu et al. 2017
[Figure: "church window" plots, Warde-Farley & Goodfellow 2016.]
Gradient-Aligned Subspaces
• Adversarial examples form a contiguous subspace of "high" dimensionality
– 15-45 dimensions for DNNs and CNNs on MNIST
– The intersection of different models' adversarial subspaces is also multi-dimensional
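A crude way to probe this dimensionality is sketched below. This is not the exact gradient-aligned subspace construction from the paper, and `predict` and `grad_x` are hypothetical stand-ins for a trained model's label function and its input-loss gradient.

```python
import numpy as np

def count_adversarial_directions(x, grad_x, predict, y_true, eps=0.25, k=32, seed=0):
    """Estimate how many orthogonal directions around x are adversarial.

    Builds an orthonormal basis whose first vector is the normalized loss
    gradient, then counts the basis directions along which an eps-sized
    step changes the predicted label.
    """
    d = x.size
    k = min(k, d)                              # at most d orthogonal directions exist
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(d, k))
    basis[:, 0] = grad_x / (np.linalg.norm(grad_x) + 1e-12)
    q, _ = np.linalg.qr(basis)                 # columns of q are orthonormal
    return sum(int(predict(x + eps * q[:, i]) != y_true) for i in range(k))
```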
Decision Boundary Similarity
[Figure: decision boundaries of two models around a data point, comparing the distance between the boundaries with the distance from the point to its boundary.]
Decision Boundary Similarity
• Experiments with MNIST and DREBIN (malware)
– Models: DNN, Logistic Regression, SVM
– 3 directions:
• Aligned with the gradient (adversarial example)
• Towards a data point of a different class
• Random
• Results: in any direction, distance to boundary ≫ distance between boundaries, i.e. the models are similar "everywhere"
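The underlying measurement can be sketched as a simple line search; the `predict` functions below are hypothetical stand-ins for the trained models, and the step sizes are arbitrary.

```python
import numpy as np

def distance_to_boundary(x, direction, predict, y_true, max_dist=10.0, step=0.05):
    """Smallest distance along `direction` at which `predict` stops returning
    y_true; np.inf if the label never changes within max_dist."""
    d = direction / (np.linalg.norm(direction) + 1e-12)
    for t in np.arange(step, max_dist, step):
        if predict(x + t * d) != y_true:
            return t
    return np.inf

# The three direction types compared on the slide:
#   gradient-aligned: np.sign(grad_x)                      (adversarial direction)
#   inter-class:      x_other_class - x                    (towards another class)
#   random:           np.random.default_rng(0).normal(size=x.shape)
# Measuring distance_to_boundary for two models along the same direction gives
# the "distance to boundary" for each, and their difference approximates the
# "distance between boundaries" in that direction.
```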
Open Questions
• Why this similarity?
– Data-dependent results?
– E.g., for a binary MNIST task (3s vs. 7s) we prove: if F1 (linear model) and F2 (quadratic model) have high accuracy, then there are adversarial examples that transfer between the two models
– These adversarial examples also transfer to DNNs and CNNs, but we can't prove this is inherent …
Transferability and Adversarial Training
Adversarial Training
[Figure: the loss gradient for a clean "bird" image is used to craft an FGSM example (classified as "plane"); the model is then trained on both the clean and the adversarial input.]
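One batch of this procedure looks roughly like the sketch below; the `model` interface (`loss_gradient_wrt_input`, `train_on_batch`) is hypothetical.

```python
import numpy as np

def adversarial_training_step(model, x_batch, y_batch, eps=0.3):
    """One batch of FGSM adversarial training (sketch).

    Crafts FGSM examples on the current model and trains on the mix of clean
    and adversarial inputs, roughly minimizing
        loss(x, y) + loss(x + eps * sign(grad_x loss(x, y)), y).
    """
    grad_x = model.loss_gradient_wrt_input(x_batch, y_batch)  # dJ/dx on this batch
    x_adv = x_batch + eps * np.sign(grad_x)                   # FGSM examples
    model.train_on_batch(np.concatenate([x_batch, x_adv]),
                         np.concatenate([y_batch, y_batch]))
```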
Attacks on Adversarial Training
[Bar charts: error rate (%) of adversarially trained models on MNIST and ImageNet (top-1). Adversarial examples transferred from another (standard) model push the error to 18.2% on MNIST and 36.5% on ImageNet, well above the white-box FGSM error rates of 3.6% and 26.8%; the remaining bars show 1.0% and 22.0%.]
Gradient Masking
• How do you get robustness to FGSM-style attacks?
[Figure: two ways this can happen: a genuinely large-margin classifier, or "gradient masking", where the local gradient simply stops pointing towards adversarial examples.]
Loss of an Adversarially Trained Model
[Figure: loss surface around a data point. Moving in the direction of the model's own gradient (white-box attack) yields a non-adversarial example; moving in the direction of another model's gradient (black-box attack) yields an adversarial example.]
Loss of an Adversarially Trained Model
[Figure: plot of the loss surface around a data point.]
A Simple One-Shot Attack: RAND+FGSM
1. Take a small random step
2. Step in the direction of the gradient

Error rate (%) of the adversarially trained model:
                   FGSM   RAND+FGSM
MNIST               3.6        34.1
ImageNet (top-1)   26.8        64.3
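A sketch of the two-step attack is below. `loss_grad` is a hypothetical gradient-of-loss-with-respect-to-input function, and splitting the ε budget into a random step of size α followed by a gradient step of size (eps - alpha) is one way to keep the total perturbation inside the ε ball; the slide itself only specifies the two steps.

```python
import numpy as np

def rand_fgsm(x, y, loss_grad, eps=0.3, alpha=0.05, seed=0):
    """RAND+FGSM sketch: 1) small random step, 2) step along the gradient.

    The random step has size alpha and the gradient step size eps - alpha,
    so the total l-infinity perturbation stays within eps.
    """
    rng = np.random.default_rng(seed)
    x_rand = x + alpha * np.sign(rng.normal(size=x.shape))  # 1. small random step
    grad = loss_grad(x_rand, y)                             # gradient at the new point
    return x_rand + (eps - alpha) * np.sign(grad)           # 2. gradient step
```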
FGSM vs. RAND+FGSM
• An improved one-shot attack, even against non-defended models:
– ≈ +4% error on MNIST
– ≈ +11% error on ImageNet
• Adversarial training against RAND+FGSM
– Doesn't work …
– Are we stuck with adversarial training?
What's Wrong with Adversarial Training?
• Minimize loss(x, y) + loss(x + ε · sign(∇_x loss(x, y)), y)
• The second term is small if:
1. The model is actually robust, or
2. The gradient points in a direction that is not adversarial: a degenerate minimum
Ensemble Adversarial Training
• How do we avoid these degenerate minima?
[Figure: the adversarial examples used during training are crafted on pre-trained ML models in addition to the model being trained.]
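A minimal sketch of one training batch is below, assuming the same hypothetical model interface as before (`loss_gradient_wrt_input`, `train_on_batch`): the FGSM examples are crafted on a source model drawn at random from a pool of pre-trained models plus the model being trained, which decouples the attack from the defended model's own, possibly masked, gradients.

```python
import random
import numpy as np

def ensemble_adv_training_step(model, static_models, x_batch, y_batch, eps=0.3):
    """One batch of ensemble adversarial training (sketch)."""
    source = random.choice(static_models + [model])             # random source model
    grad_x = source.loss_gradient_wrt_input(x_batch, y_batch)   # gradient on the source
    x_adv = x_batch + eps * np.sign(grad_x)                     # FGSM on the source model
    model.train_on_batch(np.concatenate([x_batch, x_adv]),      # train the defended model
                         np.concatenate([y_batch, y_batch]))
```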
Results: MNIST (standard CNN)
[Bar chart: error rate (%) of Adv. Training vs. Ensemble Adv. Training on clean data, under a white-box FGSM attack, and under a black-box FGSM attack whose source model was not used during training. Clean error is 0.7% for both. Ensemble adversarial training has a somewhat higher white-box FGSM error (it sees fewer white-box FGSM samples during training) but a much lower black-box FGSM error, roughly 4% versus 15.5%. Remaining bar values: 6.0, 3.9, 3.8.]
Results: ImageNet (Inception v3, Inception ResNet v2)
[Bar chart: top-1 error rate (%) of Adv. Training, Ensemble Adv. Training, and Ensemble Adv. Training (ResNet) on clean data, under a white-box FGSM attack, and under a black-box FGSM attack. The ensemble variants have the lowest error under the black-box attack, where adversarial training alone reaches 36.5%. Bar values: 36.5, 30.4, 30.0, 26.8, 25.9, 24.6, 23.6, 22.0, 20.2.]
What about stronger attacks?
• Little to no improvement against white-box iterative and RAND+FGSM attacks!
• But improvements in the black-box setting!
[Bar chart: black-box attacks on MNIST. Error rate (%) of Adv. Training vs. Ensemble Adv. Training under FGSM, Carlini-Wagner, I-FGSM, and RAND+FGSM source attacks; ensemble adversarial training has the lower error under each attack. Bar values: 15.5, 15.2, 13.5, 9.5, 7.0, 6.2, 3.9, 2.9.]
What about stronger attacks?
[Bar chart: black-box attacks on ImageNet. Error rate (%) of Adv. Training, Ensemble Adv. Training, and Ensemble Adv. Training (ResNet) under FGSM and RAND+FGSM source attacks. Bar values: 36.5, 30.8, 30.4, 29.9, 25.0, 24.6.]
Practical Considerations for Ensemble Adversarial Training
• Pre-compute gradients for the pre-trained models
– Lower per-batch cost than with standard adversarial training!
• Randomize the source model in each batch
– If we just rotate through the source models and num_batches % num_models == 0, every batch sees the same adversarial examples in each epoch (illustrated below)
• Convergence is slower (maybe because the task is actually harder? ...)
– Standard Inception v3: ~150 epochs
– Adversarial training: ~190 epochs
– Ensemble adversarial training: ~280 epochs
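A small illustration of the rotation pitfall mentioned above (the counts are made up; the point is only that a fixed rotation repeats the batch-to-source pairing whenever the number of source models divides the number of batches per epoch, while random selection does not):

```python
import random

num_models, num_batches, num_epochs = 4, 8, 3   # illustrative sizes; 8 % 4 == 0

# Plain rotation across epochs: the source index for batch b in epoch e.
for e in range(num_epochs):
    rotated = [(e * num_batches + b) % num_models for b in range(num_batches)]
    print("epoch", e, "rotation:", rotated)      # identical list every epoch

# Randomizing the source per batch breaks the repetition.
rng = random.Random(0)
for e in range(num_epochs):
    randomized = [rng.randrange(num_models) for _ in range(num_batches)]
    print("epoch", e, "random:  ", randomized)
```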
Takeaways
• Test defenses on black-box attacks
– Distillation (Papernot et al. 2016; attack by Carlini et al. 2016)
– Biologically Inspired Networks (Nayebi & Ganguli 2017; attack by Brendel & Bethge 2017)
– Adversarial training, and probably many others …
• « If you don't know where to go, just move at random. » (Morgan Freeman, or Dan Boneh)
• Ensemble Adversarial Training can improve robustness to black-box attacks
Open Problems
• Better black-box attacks?
– Using an ensemble of source models? (Liu et al. 2017)
– How much does oracle access to the model help?
• More efficient ensemble adversarial training?
• Can we say anything formal (and useful) about adversarial examples?
THANK YOU