Certified Robustness to Adversarial Examples with Differential Privacy
Mathias Lécuyer, Vaggelis Atlidakis, Roxana Geambasu, Daniel Hsu, Suman Jana
Columbia University
Code: https://github.com/columbia/pixeldp
Contact: mathias@cs.columbia.edu
Deep Learning
• Deep Neural Networks (DNNs) deliver remarkable performance on many tasks.
• DNNs are increasingly deployed, including in attack-prone contexts:
"Taylor Swift Said to Use Facial Recognition to Identify Stalkers" — Sopan Deb, Natasha Singer, Dec. 13, 2018
Example
[Figure: an input x flows through layers 1–3 and a softmax, producing scores for the classes ticket 1 (0.1), ticket 2 (0.2), ticket 3 (0.1), and no ticket (0.6); the argmax prediction is "no ticket".]
Example
But DNNs are vulnerable to adversarial example attacks.
[Figure: the same DNN and softmax scores (0.1, 0.2, 0.1, 0.6); the argmax prediction is "no ticket".]
Example
But DNNs are vulnerable to adversarial example attacks.
[Figure: adding a small perturbation to the input x shifts the softmax scores from (0.1, 0.2, 0.1, 0.6) to (0.1, 0.7, 0.1, 0.1), flipping the argmax prediction from "no ticket" to "ticket 2".]
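For concreteness, here is a minimal sketch of how such a 2-norm-bounded perturbation can be crafted with a single gradient step (fast gradient method). The toy model, input, label, and attack size below are illustrative stand-ins, not the attack used in the paper.

```python
# Hypothetical sketch: one-step L2-bounded adversarial perturbation.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy classifier
x = torch.rand(1, 3, 32, 32, requires_grad=True)                  # input image
y = torch.tensor([3])                                             # true label

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

attack_size = 0.5                                   # ||alpha||_2 budget
grad = x.grad
alpha = attack_size * grad / grad.norm(p=2)         # L2-normalized gradient step
x_adv = (x + alpha).clamp(0.0, 1.0).detach()        # adversarial example
```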
Accuracy under attack
Inception-v3 DNN on the ImageNet dataset.
[Figure: top-1 accuracy vs. attack size ‖α‖ (2-norm), falling steeply toward 0 as the attack grows from 0 to 3. Example: a "giant panda" image is misclassified as "teddy bear" and "teapot" under perturbations of size ‖α‖2 = 0.52 and 1.06.]
Best-effort approaches
1. Evaluate accuracy under attack:
• Launch an attack on examples in a test set.
• Compute accuracy on the attacked examples.
2. Improve accuracy under attack:
• Many approaches, e.g., train on adversarial examples. (e.g., Goodfellow+ '15; Papernot+ '16; Buckman+ '18; Guo+ '18)
Problem: both steps are attack-specific, leading to an arms race that attackers are winning. (e.g., Carlini-Wagner '17; Athalye+ '18)
Key questions
• Guaranteed accuracy: what is my minimum accuracy under any attack?
• Prediction robustness: given a prediction, can any attack change it?
Key questions
• Guaranteed accuracy: what is my minimum accuracy under any attack?
• Prediction robustness: given a prediction, can any attack change it?
• A few recent approaches offer provable guarantees (e.g., Wong-Kolter '18; Raghunathan+ '18; Wang+ '18).
• But they scale poorly in terms of:
  • Input dimension (e.g., number of pixels).
  • DNN size.
  • Size of training data.
Key questions
• Guaranteed accuracy: what is my minimum accuracy under any attack?
• Prediction robustness: given a prediction, can any attack change it?
• My defense, PixelDP, gives answers for norm-bounded attacks.
• Key idea: a novel use of differential privacy theory at prediction time.
• The most scalable approach: first provable guarantees for large models on ImageNet!
PixelDP outline
Motivation
Design
Evaluation
Key idea
• Problem: small input perturbations create large score changes.
[Figure: a small 2-norm perturbation of the input x shifts the softmax scores so the argmax prediction flips from "no ticket" to "ticket 2".]
Key idea
• Problem: small input perturbations create large score changes.
• Idea: design a DNN with bounded maximum score changes (leveraging Differential Privacy theory).
[Figure: the same adversarial example as before, flipping the prediction from "no ticket" to "ticket 2".]
Differential Privacy
• Differential Privacy (DP): a technique to randomize a computation over a database, such that changing one data point can only lead to bounded changes in the distribution over possible outputs.
• For an (ε, δ)-DP randomized computation A_f:
    P(A_f(d) ∈ S) ≤ e^ε · P(A_f(d′) ∈ S) + δ
• We prove the Expected Output Stability Bound: for any (ε, δ)-DP mechanism with outputs bounded in [0, 1],
    E(A_f(d)) ≤ e^ε · E(A_f(d′)) + δ
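One way to see why the expected-output bound holds (a sketch, assuming outputs lie in [0, 1] so the expectation can be written as an integral of tail probabilities):

```latex
\begin{align*}
\mathbb{E}\!\left[A_f(d)\right]
  &= \int_0^1 \mathbb{P}\!\left(A_f(d) > t\right)\, dt
     && \text{(outputs in } [0,1]\text{)} \\
  &\le \int_0^1 \left( e^{\varepsilon}\, \mathbb{P}\!\left(A_f(d') > t\right) + \delta \right) dt
     && \text{(apply the } (\varepsilon,\delta)\text{-DP bound to each event } \{A_f(d) > t\}\text{)} \\
  &= e^{\varepsilon}\, \mathbb{E}\!\left[A_f(d')\right] + \delta .
\end{align*}
```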
Key idea
• Problem: small input perturbations create large score changes.
• Idea: design a DNN with bounded maximum score changes (leveraging Differential Privacy theory).
[Figure: make the prediction DP: the input x flows through layers 1–3 and a softmax; the scores (0.1, 0.2, 0.1, 0.6) give the argmax prediction "no ticket".]
Key idea
• Problem: small input perturbations create large score changes.
• Idea: design a DNN with bounded maximum score changes (leveraging Differential Privacy theory).
[Figure: the DP prediction comes with stability bounds: each score can only move within a bounded range under small input changes.]
PixelDP architecture
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
PixelDP architecture
[Figure: the DNN with a noise layer (+) inserted after layer 1; the input x flows through layer 1, the noise layer, then layers 2–3 and the softmax, producing scores (0.1, 0.2, 0.1, 0.6).]
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
PixelDP architecture
[Figure: the computation from the input x, through layer 1 and the noise layer, to the softmax scores is (ε, δ)-DP.]
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
PixelDP architecture
[Figure: the layers after the noise layer are post-processing of an (ε, δ)-DP output, so the whole computation from the input x to the softmax scores remains (ε, δ)-DP.]
Resilience to post-processing: any computation on the output of an (ε, δ)-DP mechanism is still (ε, δ)-DP.
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
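As a rough illustration (not the paper's exact implementation), a Gaussian noise layer can be calibrated to the L2 sensitivity of the pre-noise computation, the attack bound L, and the DP parameters (ε, δ) via the standard Gaussian mechanism. The layer sizes, parameter values, and placement after the first convolution below are assumptions for the example.

```python
# Illustrative sketch of a PixelDP-style Gaussian noise layer (PyTorch).
# Assumes the pre-noise computation has L2 sensitivity <= `sensitivity`
# for input changes of 2-norm <= L (the construction attack bound).
import math
import torch
import torch.nn as nn

class GaussianNoiseLayer(nn.Module):
    def __init__(self, L=0.1, eps=1.0, delta=1e-5, sensitivity=1.0):
        super().__init__()
        # Standard Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * Delta / eps,
        # with the sensitivity Delta scaled by the attack bound L.
        self.sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity * L / eps

    def forward(self, x):
        # Fresh noise is drawn on every forward pass, at training and prediction time.
        return x + self.sigma * torch.randn_like(x)

# Example: noise layer placed right after the first (sensitivity-bounded) layer.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),   # pre-noise layer: its sensitivity must be bounded
    GaussianNoiseLayer(L=0.1, eps=1.0, delta=1e-5, sensitivity=1.0),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 10),      # post-processing: DP is preserved
)
```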
PixelDP architecture
[Figure: the noisy DNN is run multiple times; hats over the softmax scores denote empirical means.]
Compute the empirical mean scores with a standard Monte Carlo estimate.
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
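A minimal sketch of this estimation step (the `model` is assumed to contain the noise layer and output logits; the draw count is a placeholder): run the noisy network several times on the same input and average the softmax scores.

```python
import torch

@torch.no_grad()
def expected_scores(model, x, n_draws=100):
    # Each forward pass samples fresh noise in the noise layer,
    # so averaging approximates the expected (DP) output scores.
    scores = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_draws)])
    return scores.mean(dim=0), scores.std(dim=0)
```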
PixelDP architecture
[Figure: η-confidence intervals around the empirical mean scores are folded into the stability bounds before the top classes are compared.]
1. Add a new noise layer to make the DNN DP.
2. Estimate the DP DNN's mean scores.
3. Add the estimation error to the stability bounds.
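Putting the pieces together, the robustness check compares a lower bound on the top class's expected score against an upper bound on the runner-up's, inflated by the DP stability bound: the prediction is certified if Ê_lb(k) > e^(2ε) · Ê_ub(j) + (1 + e^ε)δ for all j ≠ k. The sketch below assumes that condition and uses a simplified Hoeffding-style confidence interval as a stand-in for the paper's exact construction.

```python
import math

def is_robust_prediction(mean_scores, n_draws, eps, delta, eta=0.05):
    # Hoeffding-style eta-confidence half-width for scores bounded in [0, 1]
    # (a simplified stand-in for the paper's confidence intervals).
    half_width = math.sqrt(math.log(2 / eta) / (2 * n_draws))

    k = max(range(len(mean_scores)), key=lambda i: mean_scores[i])
    lower_k = mean_scores[k] - half_width
    upper_runner_up = max(mean_scores[j] + half_width
                          for j in range(len(mean_scores)) if j != k)

    # Robust if no attack within the certified size can make another
    # class's expected score overtake class k's.
    certified = lower_k > math.exp(2 * eps) * upper_runner_up + (1 + math.exp(eps)) * delta
    return k, certified
```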
Further challenges
• Train the DP DNN with noise.
• Control pre-noise sensitivity during training (see the sketch below).
• Support various attack norms (L0, L1, L2, L∞).
• Scale to large DNNs and datasets.
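On sensitivity control, a rough illustration of one way to keep a pre-noise linear layer's L2 sensitivity at most 1 is to renormalize its weight matrix by its spectral norm after each training step. This is a sketch of the general idea, not necessarily the paper's exact procedure.

```python
import torch
import torch.nn as nn

def bound_l2_sensitivity(linear: nn.Linear):
    # For a linear map W, the L2->L2 sensitivity is its spectral norm
    # (largest singular value). Dividing W by it caps the sensitivity at 1.
    with torch.no_grad():
        spectral_norm = torch.linalg.matrix_norm(linear.weight, ord=2)
        if spectral_norm > 1.0:
            linear.weight.div_(spectral_norm)

# Usage: call after each optimizer step on the pre-noise layer.
pre_noise = nn.Linear(3 * 32 * 32, 256)
bound_l2_sensitivity(pre_noise)
```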
Scaling to Inception on ImageNet
• Large dataset: image resolution is 300x300x3.
• Large model (Inception-v3):
  • 48 layers deep.
  • 23 million parameters.
  • Released pre-trained by Google on ImageNet.
Scaling to Inception on ImageNet
[Figure: a small PixelDP auto-encoder, containing the noise layer, maps the input x to a reconstruction of x.]
Scaling to Inception on ImageNet
[Figure: the PixelDP auto-encoder (with its noise layer) is prepended to the unmodified, pre-trained Inception-v3; the large network is post-processing of the DP output, so the end-to-end prediction stays DP.]
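A rough sketch of this composition (the auto-encoder architecture, noise scale, and sizes are placeholders, not the paper's design): the noise lives inside a small auto-encoder, and the large pre-trained network consumes its output, which by the post-processing property keeps the end-to-end prediction DP.

```python
import torch
import torch.nn as nn
from torchvision.models import inception_v3

class PixelDPAutoEncoder(nn.Module):
    # Toy stand-in: encode, add calibrated noise, decode back to image space.
    def __init__(self, sigma=0.1):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, 3, padding=1)
        self.decoder = nn.Conv2d(8, 3, 3, padding=1)
        self.sigma = sigma

    def forward(self, x):
        z = self.encoder(x)
        z = z + self.sigma * torch.randn_like(z)   # DP noise inside the auto-encoder
        return torch.sigmoid(self.decoder(z))

# The pre-trained Inception-v3 is pure post-processing of the DP output
# (downloads Google's released ImageNet weights).
autoencoder = PixelDPAutoEncoder()
classifier = inception_v3(weights="IMAGENET1K_V1")
classifier.eval()

x = torch.rand(1, 3, 299, 299)
scores = torch.softmax(classifier(autoencoder(x)), dim=-1)
```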
PixelDP Outline
Motivation
Design
Evaluation
Evaluation:
1. Guaranteed accuracy on large DNNs/datasets.
2. Are robust predictions harder to attack in practice?
3. Comparison with other defenses against state-of-the-art attacks.
Methodology
Five datasets:

| Dataset   | Image size | Number of classes |
|-----------|------------|-------------------|
| ImageNet  | 299x299x3  | 1000              |
| CIFAR-100 | 32x32x3    | 100               |
| CIFAR-10  | 32x32x3    | 10                |
| SVHN      | 32x32x3    | 10                |
| MNIST     | 28x28x1    | 10                |

Three models:

| Model        | Number of layers | Number of parameters |
|--------------|------------------|----------------------|
| Inception-v3 | 48               | 23M                  |
| Wide ResNet  | 28               | 36M                  |
| CNN          | 3                | 3M                   |

Metrics:
• Guaranteed accuracy.
• Accuracy under attack.

Attack methodology:
• State-of-the-art attack [Carlini and Wagner, S&P'17].
• Strengthened against our defense by averaging gradients over multiple noise draws.
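On the last point, a minimal sketch of what averaging gradients over multiple noise draws can look like (expectation-over-transformation style); the helper below is illustrative, not the exact Carlini-Wagner implementation used in the evaluation.

```python
import torch
import torch.nn as nn

def averaged_input_gradient(model, x, y, n_draws=20):
    # Average the loss gradient over several noise draws so the attack
    # targets the defended (expected) prediction rather than one noisy sample.
    x = x.clone().detach().requires_grad_(True)
    total = torch.zeros_like(x)
    for _ in range(n_draws):
        loss = nn.functional.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        total += grad
    return total / n_draws
```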
Guaranteed accuracy on ImageNet with Inception-v3

| Model                            | Accuracy (%) | Guaranteed accuracy (%) at attack size 0.05 | 0.1 | 0.2 |
|----------------------------------|--------------|-------|-----|-----|
| Baseline                         | 78           | -     | -   | -   |
| PixelDP: L=0.25                  | 68           | 63    | 0   | 0   |
| PixelDP: L=0.75 (more DP noise)  | 58           | 53    | 49  | 40  |

Meaningful guaranteed accuracy for ImageNet!
Accuracy on robust predictions
What if we only act on robust predictions? (e.g., if the prediction is not robust, check the ticket)
[Figure: CIFAR-10; top-1 accuracy vs. attack size (2-norm, 0 to 1.4) for the baseline, plus precision and recall of robust predictions at robustness threshold 0.05.]