Feature Denoising for Improving Adversarial Robustness
Cihang Xie, Johns Hopkins University
● Background ● Towards Robust Adversarial Defense
Deep networks are Good. [Clean image → Label: King Penguin]
Deep networks are FRAGILE to small & carefully crafted perturbations. [Clean image → Label: King Penguin; perturbed image → Label: Chihuahua]
Deep networks are FRAGILE to small & carefully crafted perturbations. We call such images Adversarial Examples.
Adversarial Examples can exist on Different Tasks: semantic segmentation, pose estimation, text classification.
[1] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. "Adversarial examples for semantic segmentation and object detection." In ICCV, 2017.
[2] Moustapha Cisse, Yossi Adi, Natalia Neverova, and Joseph Keshet. "Houdini: Fooling deep structured prediction models." In NeurIPS, 2018.
[3] Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing Dou. "HotFlip: White-Box Adversarial Examples for Text Classification." In ACL, 2018.
Adversarial Examples can be created by means other than Adding Perturbations (e.g., spatially transformed adversarial examples). [Figure: object detection results on clean vs. adversarial images]
[4] Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, and Dawn Song. "Spatially transformed adversarial examples." In ICLR, 2018.
[5] Jianyu Wang, Zhishuai Zhang, Cihang Xie, et al. "Visual concepts and compositional voting." In Annals of Mathematical Sciences and Applications, 2018.
Adversarial Examples can exist in the Physical World.
[6] Lifeng Huang, Chengying Gao, Yuyin Zhou, Changqing Zou, Cihang Xie, Alan Yuille, and Ning Liu. "UPC: Learning Universal Physical Camouflage Attacks on Object Detectors." arXiv, 2019.
Generating an Adversarial Example is SIMPLE:
● maximize_r loss(f(x + r), y_true; θ) — maximize the loss function w.r.t. the adversarial perturbation r
● minimize_θ loss(f(x), y_true; θ) — minimize the loss function w.r.t. the network parameters θ
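A minimal PGD-style sketch of this max step, for illustration only; the model, epsilon, step size, and iteration count are assumed values, not the exact settings from the talk.

```python
# Sketch: maximize loss(f(x + r), y_true; theta) w.r.t. the perturbation r
# under an L-infinity bound. Hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def generate_adversarial(model, x, y_true, epsilon=16 / 255, step=1 / 255, iters=10):
    r = torch.zeros_like(x).uniform_(-epsilon, epsilon)   # random start inside the epsilon-ball
    for _ in range(iters):
        r.requires_grad_(True)
        loss = F.cross_entropy(model(x + r), y_true)      # loss(f(x + r), y_true; theta)
        grad, = torch.autograd.grad(loss, r)
        with torch.no_grad():
            r = r + step * grad.sign()                    # gradient *ascent* on the loss
            r = r.clamp(-epsilon, epsilon)                # keep the perturbation small
            r = (x + r).clamp(0, 1) - x                   # keep x + r a valid image
    return (x + r).detach()
```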
● Background ● Towards Robust Adversarial Defense
Observation: Adversarial perturbations are SMALL in the pixel space. [Figure: clean vs. adversarial images and their feature maps]
Observation: Adversarial perturbations are BIG in the feature space. [Figure: feature maps of clean vs. adversarial images]
We should DENOISE these feature maps.
Our Solution: Denoising at the feature level
Traditional Image Denoising Operations:
● Local filters (predefine a local region Ω_i for each pixel i):
  ● Bilateral filter: y_i = (1 / C(x)) · Σ_{∀j ∈ Ω_i} f(x_i, x_j) · x_j
  ● Median filter: y_i = median{∀j ∈ Ω_i : x_j}
  ● Mean filter: y_i = (1 / |Ω_i|) · Σ_{∀j ∈ Ω_i} x_j
● Non-local filters (the local region Ω_i is the whole image I):
  ● Non-local means: y_i = (1 / C(x)) · Σ_{∀j ∈ I} f(x_i, x_j) · x_j
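A rough PyTorch sketch of two of these operations applied to a feature map; shapes and names are assumptions for illustration, not the authors' reference implementation.

```python
# x is a feature map of shape (N, C, H, W).
import torch
import torch.nn.functional as F

def mean_filter(x, k=3):
    # local mean: y_i = (1 / |Omega_i|) * sum of x_j over the k x k neighborhood
    return F.avg_pool2d(x, kernel_size=k, stride=1, padding=k // 2)

def nonlocal_means(x):
    # non-local means: y_i = (1 / C(x)) * sum over the whole map of f(x_i, x_j) * x_j,
    # here with the dot-product affinity f(x_i, x_j) = x_i . x_j; a Gaussian variant
    # would apply a softmax over the affinities instead of dividing by H*W.
    n, c, h, w = x.shape
    flat = x.view(n, c, h * w)                            # (N, C, HW)
    affinity = torch.bmm(flat.transpose(1, 2), flat)      # (N, HW, HW)
    y = torch.bmm(flat, affinity.transpose(1, 2))         # weighted sum over all positions
    return (y / (h * w)).view(n, c, h, w)                 # normalize by C(x) = H*W
```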
Denoising Block Design: denoising operation → 1×1 conv, wrapped by a residual connection.
● Denoising operations may lose information, so we add a residual connection to balance the tradeoff between removing noise and retaining the original signal.
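A sketch of such a block, assuming the nonlocal_means operation from the previous sketch; the channel count and wiring details are illustrative, not the talk's exact architecture.

```python
import torch.nn as nn

class DenoisingBlock(nn.Module):
    def __init__(self, channels, denoise_op):
        super().__init__()
        self.denoise_op = denoise_op                              # e.g. nonlocal_means or mean_filter
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)  # 1x1 conv after denoising

    def forward(self, x):
        y = self.denoise_op(x)   # suppress noise in the feature map
        y = self.conv(y)         # learnable 1x1 projection
        return x + y             # residual connection: retain the original signal

# Usage (assumed): block = DenoisingBlock(channels=256, denoise_op=nonlocal_means)
```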
Training Strategy: Adversarial Training
● Core Idea: train with adversarial examples
● min_θ max_r loss(f(x + r), y_true; θ)
  ● max step: generate the adversarial perturbation
  ● min step: optimize the network parameters
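A minimal sketch of this min-max loop, reusing generate_adversarial() from the earlier sketch; the model, optimizer, and data loader are assumed.

```python
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer):
    model.train()
    for x, y_true in loader:
        # max step: craft adversarial examples against the current parameters theta
        x_adv = generate_adversarial(model, x, y_true)
        # min step: update theta to lower the loss on those adversarial examples
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y_true)
        loss.backward()
        optimizer.step()
```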
Two Ways for Evaluating Robustness
● Defending Against White-box Attacks
  ● Attackers know everything about the model
  ● Directly maximize loss(f(x + r), y_true; θ)
● Defending Against Blind Attacks
  ● Attackers know nothing about the model
  ● Attackers generate adversarial examples using substitute networks (relying on transferability)
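A sketch of the blind (black-box transfer) setting: adversarial examples are crafted on an assumed substitute network and only then shown to the defended model, relying purely on transferability. It reuses generate_adversarial() from the earlier sketch.

```python
import torch

def transfer_attack_accuracy(defended_model, substitute_model, loader):
    correct = total = 0
    for x, y_true in loader:
        # the attacker never queries the defended model
        x_adv = generate_adversarial(substitute_model, x, y_true)
        with torch.no_grad():
            pred = defended_model(x_adv).argmax(dim=1)
        correct += (pred == y_true).sum().item()
        total += y_true.numel()
    return correct / total
```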
Defending Against White-box Attacks
● Evaluating against adversarial attackers with attack iterations up to 2000 (more attack iterations indicate stronger attacks)
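The evaluation protocol can be sketched as a loop over attack strengths; the iteration counts below mirror the plots that follow, while the model and loader are assumed and generate_adversarial() is the earlier sketch.

```python
import torch

def accuracy_vs_attack_iterations(model, loader, iteration_counts=(10, 100, 2000)):
    results = {}
    for iters in iteration_counts:
        correct = total = 0
        for x, y_true in loader:
            x_adv = generate_adversarial(model, x, y_true, iters=iters)
            with torch.no_grad():
                pred = model(x_adv).argmax(dim=1)
            correct += (pred == y_true).sum().item()
            total += y_true.numel()
        results[iters] = 100.0 * correct / total  # accuracy (%) at this attack strength
    return results
```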
Defending Against White-box Attacks – Part I
[Plot: accuracy (%) vs. attack iterations (10 to 2000). Curves: ALP (Inception-v3) and ours (R-152 baseline).]
A successful adversarial training can give us a STRONG baseline: the R-152 baseline degrades only from 41.7% to 38.9% accuracy as the PGD attack grows to 2000 iterations, whereas ALP ends at 27.9%.
Defending Against White-box Attacks – Part I
[Plot adds: ours, R-152 denoise.]
Feature Denoising can give us additional benefits: the R-152 denoise model degrades only from 45.5% to 42.6% under the 2000-iteration PGD attack, staying consistently above the baseline.
Defending Against White-box Attacks – Part II
[Plot: accuracy (%) vs. attack iterations (10 to 100). Curves: ResNet-152 baseline; +4 bottleneck (ResNet-164); +4 denoise variants: null (1×1 only), 3×3 mean, 3×3 median, bilateral (dot product / Gaussian), non-local (dot product / Gaussian).]
All denoising operations can help: every denoising variant improves over the ResNet-152 baseline (52.5% at 10 iterations, 41.7% at 100), with the best variant reaching 55.7% and 45.5% respectively.
Defending Against White-box Attacks – Part III
[Plot: accuracy (%) vs. attack iterations (10 to 100). Curves: ResNet-152, ResNet-152 denoise, ResNet-638.]
Feature Denoising is nearly as powerful as adding ~500 additional layers: ResNet-152 denoise (55.7% at 10 iterations, 45.5% at 100) is close to ResNet-638 (57.3%, 46.1%), both well above the ResNet-152 baseline (52.5%, 41.7%).
Defending Against White-box Attacks – Part III
[Plot adds: ResNet-638 denoise.]
Feature Denoising can still provide benefits for the VERY deep ResNet-638: ResNet-638 denoise reaches 61.3% at 10 iterations and 49.9% at 100, vs. 57.3% and 46.1% without denoising.
Defending Against Blind Attacks
● Offline evaluation against the 5 BEST attackers from the NeurIPS 2017 Adversarial Competition
● Online competition against 48 UNKNOWN attackers in CAAD 2018
● CAAD 2018 "all or nothing" criterion: an image is considered correctly classified only if the model correctly classifies all adversarial versions of this image created by all attackers
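A sketch of the all-or-nothing criterion, assuming a hypothetical data layout in which versions_per_image maps each image id to a list of (x_adv, y_true) pairs, one per attacker.

```python
import torch

@torch.no_grad()
def all_or_nothing_accuracy(model, versions_per_image):
    num_correct = 0
    for versions in versions_per_image.values():
        # the image counts only if *every* attacker's version is classified correctly
        ok = all(model(x_adv.unsqueeze(0)).argmax(dim=1).item() == y_true
                 for x_adv, y_true in versions)
        num_correct += int(ok)
    return num_correct / len(versions_per_image)
```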
Defending Against Blind Attacks — CAAD 2017 Offline Evaluation
Defending Against Blind Attacks — CAAD 2018 Online Competition
[Bar chart: accuracy (%) under the all-or-nothing criterion for the top-5 teams]
1st: 50.6    2nd: 40.8    3rd: 8.6    4th: 3.6    5th: 0.6
Our defense placed 1st with 50.6% accuracy.
Visualization
[Figure: feature maps of adversarial examples before and after the denoising operations]
There is still a long way to go in defending against adversarial attacks…
Questions?