Limits on Robustness to Adversarial Examples
Elvis Dohmatob, Criteo AI Lab
October 2, 2019
Table of contents
1. Preliminaries on adversarial robustness
2. Classifier-dependent lower bounds
3. Universal lower bounds
Preliminaries on adversarial robustness
Definition of adversarial attacks
- A classifier is trained and deployed (e.g. the computer vision system on a self-driving car).
- At test / inference time, an attacker may submit queries to the classifier by sampling a real sample point x with true label k (e.g. "pig") and modifying it, x ↦ x_adv, according to a prescribed threat model.
- The goal of the attacker is to make the classifier label x_adv as some class ≠ k (e.g. "airliner").
The flying pig! (Picture courtesy of https://gradientscience.org/intro_adversarial/)
x ↦ x_adv := x + noise, with ‖noise‖ ≤ ε = 0.005 in the example above.
Fast Gradient Sign Method: noise = ε · sign(∇_x loss(h(x), y))
FGSM for generating adversarial examples [Goodfellow '14]
x ↦ x_adv := clip(x + ε · sign(∇_x loss(h(x), y)))
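As an illustration (not part of the original slides), here is a minimal FGSM sketch in PyTorch. The classifier `model` (returning logits), the cross-entropy loss, and the [0, 1] pixel range are assumptions made only for this example.

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps):
        # One-step attack: move each pixel by eps in the direction that increases the loss.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        x_adv = x_adv + eps * x_adv.grad.sign()
        # Clip back to the valid pixel range (the "clip" in the formula above).
        return x_adv.clamp(0.0, 1.0).detach()

With a small budget such as ε = 0.005 on a normalized image, this produces the kind of visually imperceptible perturbation shown in the flying-pig example.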
Adversarial attacks and defenses: an arms race! (Image courtesy of [Goldstein '19; Shafahi '19])
Classifier-dependent lower bounds
Outline: Problem setup; No Free Lunch Theorems; The Strong No Free Lunch Theorem; Corollaries
Problem setup
- A classifier is simply a Borel-measurable mapping h : X → Y from the feature space X (with metric d) to the label space Y := {1, ..., K}.
- A classifier is trained and deployed (e.g. the computer vision system on a self-driving car).
- At test / inference time, an attacker may submit queries to the classifier by sampling a real sample point x ∈ X with true label k ∈ Y and modifying it, x ↦ x_adv, according to a prescribed threat model. For example: modifying a few pixels on a road traffic sign [Su et al. '17]; or modifying the intensity of pixels by a limited amount, determined by a prescribed tolerance level [Tsipras '18]. A tiny sketch of such a bounded-intensity constraint is given below.
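For concreteness, here is a minimal numpy sketch of the bounded-intensity (ℓ∞) threat model; the function name, the radius eps, and the [0, 1] pixel range are illustrative assumptions, not notation from the slides.

    import numpy as np

    def project_linf(x_adv, x, eps, lo=0.0, hi=1.0):
        # Keep the perturbed image within an l_inf ball of radius eps around x ...
        x_adv = np.clip(x_adv, x - eps, x + eps)
        # ... and within the valid pixel range.
        return np.clip(x_adv, lo, hi)

Any attack that returns a candidate x_adv can be made to respect this threat model by applying project_linf to its output.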
Problem setup: notation
- Standard accuracy: acc(h | k) := 1 − err(h | k), where err(h | k) := P_{X|k}(h(X) ≠ k) is the error of h on class k. Small acc(h | k) ⟹ h is inaccurate on class k.
- Adversarial robustness accuracy: acc_ε(h | k) := 1 − err_ε(h | k), where err_ε(h | k) := P_{X|k}(∃ x′ ∈ Ball(X; ε) such that h(x′) ≠ k) is the adversarial robustness error of h on class k. Small acc_ε(h | k) ⟹ h is vulnerable to attacks on class k.
- Distance to the error set: d(h | k) := E_{P_{X|k}}[d(X, B(h, k))] denotes the average distance of a sample point with true label k from the error set B(h, k) := {x ∈ X | h(x) ≠ k} of points assigned to another label. Small d(h | k) ⟹ h is vulnerable to attacks on class k.
A small worked example of these quantities is sketched below.
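To make these quantities concrete, here is a minimal numpy sketch (an illustration added here, not from the talk): a 1-D threshold classifier h(x) = sign(x) on a class-conditional Gaussian X | k = +1 ~ N(1, 1), for which the three quantities reduce to simple expressions; the Monte Carlo estimates below just check the definitions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, mu, eps = 100_000, 1.0, 0.3

    # Class k = +1 with X | k ~ N(mu, 1); classifier h(x) = sign(x),
    # so the error set is B(h, +1) = {x : x <= 0}.
    x = rng.normal(mu, 1.0, size=n)

    err     = np.mean(x <= 0)              # err(h | +1): point already misclassified
    err_eps = np.mean(x <= eps)            # err_eps(h | +1): some x' in [x - eps, x + eps] is misclassified
    dist    = np.mean(np.maximum(x, 0.0))  # d(h | +1): distance to the error set (0 if already inside it)

    print(f"acc = {1 - err:.3f}, acc_eps = {1 - err_eps:.3f}, d(h|+1) ≈ {dist:.3f}")

Increasing eps interpolates between the standard error and ever more pessimistic notions of error, and a small d(h | +1) signals that typical class-+1 points sit close to the decision boundary.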
A motivating example (from [Tsipras '18])
Consider the following classification problem:
- Prediction target: Y ~ Bern(1/2, {±1}), to be predicted from p ≥ 2 explanatory variables X := (X_1, X_2, ..., X_p) given by:
- Robust feature: X_1 | Y = +Y w.p. 70% and −Y w.p. 30%.
- Non-robust features: X_j | Y ~ N(ηY, 1) for j = 2, ..., p, where η ∼ p^(−1/2) is a fixed scalar which controls the difficulty.
- The linear classifier h_lin(x) ≡ sign(w^T x) with w = (0, 1/p, ..., 1/p) solves the problem almost perfectly (standard accuracy approaching 100%), yet once we allow ℓ∞-perturbations of maximum size ε ≥ 2η, its adversarial robustness accuracy is zero! (A small simulation of this construction follows below.)
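Here is a minimal numpy simulation of this construction (an illustration, not from the slides); the constant in η = 4/√p, the sample size, and the dimension are assumptions chosen so that the gap between standard and robust accuracy is clearly visible.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 10_000, 1_000
    eta = 4.0 / np.sqrt(p)        # eta ~ p^(-1/2); the constant 4 is an arbitrary illustrative choice
    eps = 2.0 * eta               # perturbation budget eps >= 2*eta from the slide

    y = rng.choice([-1.0, 1.0], size=n)                           # Y ~ Bern(1/2, {+-1})
    x1 = np.where(rng.random(n) < 0.7, y, -y)                     # robust feature X_1
    x_rest = rng.normal(eta * y[:, None], 1.0, size=(n, p - 1))   # non-robust features X_2, ..., X_p
    X = np.column_stack([x1, x_rest])

    w = np.concatenate([[0.0], np.full(p - 1, 1.0 / p)])          # the linear classifier from the slide

    std_acc = np.mean(np.sign(X @ w) == y)

    # Worst-case l_inf attack on a linear classifier: shift each coordinate by -eps * y * sign(w_j).
    X_adv = X - eps * y[:, None] * np.sign(w)[None, :]
    rob_acc = np.mean(np.sign(X_adv @ w) == y)

    print(f"standard accuracy ≈ {std_acc:.3f}, robust accuracy ≈ {rob_acc:.3f}")

The attack simply pushes every non-robust feature past the decision boundary, which is possible exactly because the signal η in each of them is smaller than the perturbation budget ε ≥ 2η.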