Adversarial Training and Robustness for Multiple Perturbations


  1. Adversarial Training and Robustness for Multiple Perturbations. Poster #87. Florian Tramèr & Dan Boneh, NeurIPS 2019

  2. Adversarial examples
[Figure: an imperceptibly perturbed image of a tabby cat (88% confidence) is classified as guacamole (99% confidence).]
• ML models learn very different features than humans
• This is a safety concern for deployed ML models
• Classification in adversarial settings is hard
Szegedy et al., 2014; Goodfellow et al., 2015; Athalye, 2017

  3. Adversarial training (Szegedy et al., 2014; Madry et al., 2017)
1. Choose a set of perturbations, e.g., noise of small ℓ∞ norm: S = {δ : ‖δ‖∞ ≤ ε}
2. For each example (x, y), find an adversarial example: x_adv = x + argmax_{δ ∈ S} L(x + δ, y)
3. Train the model on (x_adv, y)
4. Repeat until convergence
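The loop above maps directly to code. Below is a minimal PyTorch sketch of ℓ∞ adversarial training, with the inner maximization approximated by projected gradient descent (PGD) as in Madry et al.; `model`, `loader`, `optimizer` and the budget values `eps`, `alpha`, `steps` are hypothetical placeholders, not taken from the talk.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Approximate argmax over {delta : ||delta||_inf <= eps} of the loss via PGD."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        # Ascend the loss, then project back onto the l_inf ball of radius eps.
        delta.data = (delta.data + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.grad.zero_()
    # Clamp so the result stays a valid image in [0, 1].
    return (x + delta).detach().clamp(0, 1)

def adversarial_training(model, loader, optimizer, epochs=10):
    for _ in range(epochs):                          # 4. repeat until convergence
        for x, y in loader:
            x_adv = pgd_linf(model, x, y)            # 2. find an adversarial example
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x_adv), y)  # 3. train on it
            loss.backward()
            optimizer.step()
```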

  4. How well does it work? Adversarial training on CIFAR10 with ℓ∞ noise:
[Bar chart] No noise: 96% accuracy | ℓ∞ noise: 70% | ℓ1 noise: 16% | Rotation: 9%
Engstrom et al., 2017; Sharma & Chen, 2018

  5. How to prevent other adversarial examples?
S1 = {δ : ‖δ‖∞ ≤ ε∞}
S2 = {δ : ‖δ‖1 ≤ ε1}
S3 = {δ : δ is a small rotation}
The adversary can choose a perturbation type for each input: S = S1 ∪ S2 ∪ S3
• Pick the worst-case adversarial example from S (sketched below)
• Train the model on that example
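A sketch of the worst-case ("max") selection over S = S1 ∪ S2 ∪ S3, continuing the PyTorch sketch above. The per-type attack functions passed in `attacks` are stand-ins (e.g., the `pgd_linf` above plus hypothetical `pgd_l1` and `rotation_attack` helpers that the slides don't spell out):

```python
import torch
import torch.nn.functional as F

def worst_case_examples(model, x, y, attacks):
    """Per-example worst case over S = S1 ∪ S2 ∪ S3: run one attack per
    perturbation type and keep, for each input, the candidate that
    maximizes the loss."""
    candidates = [attack(model, x, y) for attack in attacks]
    with torch.no_grad():
        # Per-example loss for every candidate: shape (num_attacks, batch_size).
        losses = torch.stack([
            F.cross_entropy(model(c), y, reduction="none") for c in candidates
        ])
    worst = losses.argmax(dim=0)        # index of the worst attack per example
    stacked = torch.stack(candidates)   # (num_attacks, batch_size, ...)
    return stacked[worst, torch.arange(x.shape[0])]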

  6. Does this work? A robustness tradeoff is provably inherent in some classification tasks:
increased robustness to one type of noise ⇒ decreased robustness to another.
Empirically validated on CIFAR10 & MNIST.
MNIST, trained against ℓ∞, ℓ1 and ℓ2 noise together: about 50% accuracy, and the trained model suffers from gradient masking.

  7. What if we combine perturbations?
[Figure: natural image | rotation | ℓ∞ noise | ½ rotation + ½ ℓ∞ noise]
No noise: 96% accuracy | One noise type: 70% | One of two noise types: 65% | Mixture of two noise types: 55%
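One way to read "½ rotation + ½ ℓ∞ noise" is to spend half of each perturbation budget on a single input. A hypothetical sketch, reusing `pgd_linf` from the earlier block and assuming torchvision's `rotate`; the fixed angle and `max_angle=30` are illustrative only (a real attack would search over the allowed angles):

```python
import torchvision.transforms.functional as TF

def half_rotation_half_linf(model, x, y, max_angle=30.0, eps=8/255):
    """'½ rotation + ½ ℓ∞ noise': spend half of each budget on one input."""
    # Half of the rotation budget (a single fixed angle for brevity).
    x_rot = TF.rotate(x, angle=max_angle / 2)
    # Half of the ℓ∞ budget, added on top of the rotated image.
    return pgd_linf(model, x_rot, y, eps=eps / 2)
```

The slide's point is that a model trained against each perturbation set separately can still lose accuracy on such mixtures, which is why the mixed column drops to 55%.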

  8. Conclusion. Adversarial training for multiple perturbation sets works, but...
• Significant loss in robustness
• Weak robustness to affine combinations of perturbations
Open questions:
• Train a single MNIST model with high robustness to any ℓp noise
• Better scaling of multi-perturbation adversarial training
• Which perturbations do we care about?
Poster #87, https://arxiv.org/abs/1904.13000
