A geometric perspective on the robustness of deep networks

A geometric perspective on the robustness of deep networks
Seyed-Mohsen Moosavi-Dezfooli
Amirkabir Artificial Intelligence Summer Summit, July 2019

Tehran Polytechnic, Iran

EPFL, Lausanne, Switzerland

Collaborators
Pascal Frossard (EPFL)
Alhussein Fawzi (Google DeepMind)
Stefano Soatto (UCLA)
Omar Fawzi (ENS-Lyon)
Jonathan Uesato (Google DeepMind)
Can Kanbak (Bilkent)
Apostolos Modas (EPFL)

Are we ready?
[Image credits: Google Research; Rachel Jones, Biomedical Computation Review; David Paul Morris, Bloomberg/Getty Images]

Adversarial perturbations
[Figure: image $x$ (“School Bus”) + perturbation $r$ = image classified by $\hat{k}(\cdot)$ as “Ostrich”]
▪ Intriguing properties of neural networks, Szegedy et al., ICLR 2014.

Universal (adversarial) perturbations
[Figure labels: Joystick, Flag pole, Balloon, Face powder, Labrador, Chihuahua]
▪ Universal adversarial perturbations, Moosavi et al., CVPR 2017.

“Geometry is not true, it is advantageous.”
Henri Poincaré

Geometry of …
Adversarial perturbations: How large is the “space” of adversarial examples?
Universal perturbations: What causes the vulnerability of deep networks to universal perturbations?
Adversarial training: What geometric features contribute to better robustness properties?

Geometry of adversarial perturbations

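Geometric interpretation of adversarial perturbations
For a datapoint $x \in \mathbb{R}^d$:
$$r^* = \arg\min_{r} \|r\|_2 \quad \text{s.t.} \quad \hat{k}(x + r) \neq \hat{k}(x)$$
[Figure: datapoint $x$, perturbation $r$, and the perturbed point $x + r^*$]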
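To make the definition concrete, here is a minimal sketch for the one case where the minimization has a closed form: a binary linear classifier $f(x) = w^\top x + b$, for which the smallest $\ell_2$ perturbation reaching the boundary is $-f(x)\,w / \|w\|_2^2$. The linear model, the helper name `minimal_l2_perturbation`, and the random data are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def minimal_l2_perturbation(w, b, x):
    """Smallest l2 perturbation moving x onto the hyperplane w.x + b = 0 (linear model assumed)."""
    f = float(np.dot(w, x) + b)
    return -f * w / np.dot(w, w)

rng = np.random.default_rng(0)
d = 50
w = rng.normal(size=d)   # hypothetical classifier weights
b = 0.3
x = rng.normal(size=d)   # hypothetical datapoint

r_star = minimal_l2_perturbation(w, b, x)
f_clean = np.dot(w, x) + b

# A small overshoot pushes the point strictly across the boundary,
# so the predicted label (the sign of f) changes.
f_adv = np.dot(w, x + 1.02 * r_star) + b

# The perturbation norm equals the distance from x to the hyperplane.
dist = abs(f_clean) / np.linalg.norm(w)
print(np.sign(f_clean), np.sign(f_adv), np.linalg.norm(r_star), dist)
```

For a deep network there is no closed form, so this only illustrates the geometry of the definition: $r^*$ points along the normal to the decision boundary.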

Two hypotheses
Adversarial examples are “blind spots”.
▪ Intriguing properties of neural networks, Szegedy et al., ICLR 2014.
Deep classifiers are “too linear”.
▪ Explaining and harnessing adversarial examples, Goodfellow et al., ICLR 2015.

Normal cross-sections of decision boundary
[Figure: datapoint $x$, perturbation $r$, direction $v$, tangent space $T_x B$, and cross-section plane $U$]
▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

Curvature of decision boundary of deep nets
Decision boundary of CNNs is almost flat along random directions.
[Figure: normal cross-section around datapoint $x$ with boundaries $B_1$ and $B_2$]
▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

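Space of adversarial perturbations
Adversarial perturbations constrained to a random subspace $S$ of dimension $m$:
$$r_S^*(x) = \arg\min_{r \in S} \|r\|_2 \quad \text{s.t.} \quad \hat{k}(x + r) \neq \hat{k}(x)$$
For low curvature classifiers, w.h.p., we have
$$\|r_S^*(x)\|_2 = \Theta\left(\sqrt{\frac{d}{m}}\, \|r^*(x)\|_2\right)$$
[Figure: datapoint $x$, subspace $S$, and perturbations $r^*$ and $r_S^*$]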
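The $\sqrt{d/m}$ scaling can be checked numerically in the simplest zero-curvature case, a linear classifier, where both minimizations have closed forms. The sketch below is an illustration under that assumption only; the weights, the datapoint, and the dimensions `d` and `m` are hypothetical values, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 2000, 100               # ambient dimension and subspace dimension (illustrative)

w = rng.normal(size=d)         # hypothetical linear classifier f(x) = w.x + b
b = 0.5
x = rng.normal(size=d)
f = np.dot(w, x) + b

# Unconstrained minimal perturbation: norm is the distance to the hyperplane.
norm_r_star = abs(f) / np.linalg.norm(w)

# Orthonormal basis Q (d x m) of a random m-dimensional subspace S.
Q, _ = np.linalg.qr(rng.normal(size=(d, m)))

# Minimal perturbation restricted to S: move along the projection of w onto S.
w_S = Q @ (Q.T @ w)
r_S = -f * w_S / np.dot(w_S, w_S)

ratio = np.linalg.norm(r_S) / norm_r_star
predicted = np.sqrt(d / m)
print(ratio, predicted)
```

In this flat setting the ratio equals $\|w\|_2 / \|w_S\|_2$, which concentrates around $\sqrt{d/m}$ for a random subspace. A subspace of dimension $m \ll d$ therefore still contains adversarial perturbations, only larger ones.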

Structured additive perturbations
The “space” of adversarial examples is quite vast.
[Figure: “Flowerpot” image + structured perturbation = “Pineapple”]
▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

Summary
Geometry of adversarial examples:
Decision boundary is “locally” almost flat.
Datapoints lie close to the decision boundary.
Flatness can be used to:
construct a diverse set of perturbations;
design efficient attacks.

Geometry of universal perturbations

Universal adversarial perturbations (UAP)
[Figure labels: Joystick, Flag pole, Balloon, Face powder, Labrador, Chihuahua]
85 %
▪ Universal adversarial perturbations, Moosavi et al., CVPR 2017.

Diversity of perturbations
Diversity of UAPs
[Figure panels: VGG-19, VGG-16, VGG-F, CaffeNet, ResNet-152, GoogLeNet]

Why do universal perturbations exist?
Flat model
Curved model

Flat model
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Flat model (cont’d)
Normals to the decision boundary are “globally” correlated.
[Figure: plot of singular values, index 1 to 50’000, for normals of the decision boundary vs. random vectors]
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.
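The comparison behind this plot can be sketched on synthetic data: stack unit vectors as rows of a matrix and compare how quickly the singular values decay. Here the “normals” are hypothetical vectors drawn mostly from a shared low-dimensional subspace, which is an assumption standing in for real decision-boundary normals; the sizes `d`, `n`, `k` are also illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 200, 500, 10         # dimension, number of vectors, shared-subspace dimension

def unit_rows(a):
    """Normalize every row to unit l2 norm."""
    return a / np.linalg.norm(a, axis=1, keepdims=True)

# Hypothetical correlated "normals": a shared k-dim subspace plus small noise.
basis = rng.normal(size=(k, d))
normals = unit_rows(rng.normal(size=(n, k)) @ basis + 0.05 * rng.normal(size=(n, d)))

# Baseline: uncorrelated random unit vectors.
random_vecs = unit_rows(rng.normal(size=(n, d)))

def top_k_energy(mat, k):
    """Fraction of squared singular-value mass in the k largest singular values."""
    s = np.linalg.svd(mat, compute_uv=False)
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

energy_normals = top_k_energy(normals, k)
energy_random = top_k_energy(random_vecs, k)
print(energy_normals, energy_random)
```

Fast decay for the correlated set means a few directions are nearly aligned with most normals, which is the property the flat model relies on to explain a perturbation that works across many datapoints.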

Flat model (cont’d)
The flat model only partially explains the universality.
Random: 13 %
Flat model: 38 %
UAP (greedy algorithm): 85 %
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Curved model
The principal curvatures of the decision boundary:
[Figure: curvature profile, horizontal axis 0 to 3000]
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Curved model (cont’d)
The principal curvatures of the decision boundary:
[Figure: curvature profile, horizontal axis 0 to 3000, with a normal section at datapoint $x$ showing normal $n$ and direction $v$]
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Curved directions are shared
Normal sections of the decision boundary (for different datapoints) along a single direction:
[Figure panels: UAP direction; Random direction]
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Curved directions are shared (cont’d)
The curved model better explains the existence of universal perturbations.
Random: 13 %
Flat model: 38 %
Curved model: 67 %
UAP: 85 %
▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Summary
Universality of perturbations: Shared curved directions explain this vulnerability.
A possible solution: Regularizing the geometry to combat universal perturbations.
Why are deep nets curved?
▪ With friends like these, who needs adversaries?, Jetley et al., NeurIPS 2018.

Geometry of adversarial training

In a nutshell
Adversarial training: image batch $x$, adversarial perturbations, training on $x + r$
Curvature regularization

Adversarial training
One of the most effective methods to improve adversarial robustness…
Image batch $x$, adversarial perturbations, training on $x + r$
▪ Obfuscated gradients give a false sense of security, Athalye et al., ICML 2018 (Best paper).
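The loop on this slide (take a batch $x$, craft perturbations $r$, train on $x + r$) can be sketched on a toy problem. Everything specific below is an assumption for illustration: a logistic-regression model instead of a deep network, a single signed-gradient step with an $\ell_\infty$ budget `eps` instead of a stronger attack such as PGD, and synthetic two-blob data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, lr = 200, 20, 0.1, 0.5   # illustrative sizes, l_inf budget, step size

# Toy data: two Gaussian blobs with labels in {-1, +1}.
y = rng.choice([-1.0, 1.0], size=n)
x = rng.normal(size=(n, d)) + 0.5 * y[:, None]

def loss_and_grads(w, b, x, y):
    """Mean logistic loss and its gradients w.r.t. w, b and the inputs x."""
    margin = y * (x @ w + b)
    loss = np.mean(np.logaddexp(0.0, -margin))
    coef = -y / (1.0 + np.exp(margin))   # per-sample d loss / d (w.x + b)
    grad_w = x.T @ coef / len(y)
    grad_b = np.mean(coef)
    grad_x = coef[:, None] * w[None, :]  # per-sample input gradient
    return loss, grad_w, grad_b, grad_x

w, b = 0.01 * rng.normal(size=d), 0.0
for _ in range(100):
    # 1) craft perturbations r for the current batch (one signed-gradient step)
    _, _, _, grad_x = loss_and_grads(w, b, x, y)
    r = eps * np.sign(grad_x)
    # 2) update the model on the perturbed batch x + r
    _, grad_w, grad_b, _ = loss_and_grads(w, b, x + r, y)
    w -= lr * grad_w
    b -= lr * grad_b

# Evaluate: the crafted perturbation never lowers the loss of this linear model.
_, _, _, grad_x = loss_and_grads(w, b, x, y)
r = eps * np.sign(grad_x)
clean_loss = loss_and_grads(w, b, x, y)[0]
adv_loss = loss_and_grads(w, b, x + r, y)[0]
print(clean_loss, adv_loss)
```

For a linear model the signed-gradient step is the exact worst case under the $\ell_\infty$ budget. For deep networks the inner step is only an approximation, which is why iterative attacks are normally used there.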

Geometry of adversarial training
Curvature profiles of normally and adversarially trained networks:
[Figure: curvature profiles, Normal vs. Adversarial, horizontal axis 0 to 3000]
▪ Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.

Curvature Regularization (CURE)
                                   Normal training   CURE     Adversarial training
Clean                              94.9 %            81.2 %   79.4 %
PGD with $\|r^*\|_\infty = 8$      0.0 %             36.3 %   43.7 %
▪ Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.
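The slides do not spell out the regularizer, so the following is only a sketch of one common way to penalize curvature: a finite-difference proxy $\|\nabla \ell(x + h z) - \nabla \ell(x)\|_2^2$, which approximates $h^2 \|H z\|_2^2$ for the Hessian $H$ of the loss. The choice of $z$ as the normalized gradient sign, the step `h`, and the toy quadratic loss are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 10, 1e-2                 # input dimension and finite-difference step (illustrative)

# Toy loss l(x) = 0.5 * x^T A x with a known symmetric Hessian A.
M = rng.normal(size=(d, d))
A = M @ M.T

def grad_loss(x):
    """Gradient of the toy quadratic loss."""
    return A @ x

def curvature_penalty(grad_fn, x, h):
    """Finite-difference curvature proxy along a normalized gradient-sign direction."""
    g = grad_fn(x)
    z = np.sign(g)
    z = z / np.linalg.norm(z)
    diff = grad_fn(x + h * z) - g
    return float(np.dot(diff, diff)), z

x = rng.normal(size=d)
penalty, z = curvature_penalty(grad_loss, x, h)

# For a quadratic loss the proxy is exact: it equals h^2 * ||A z||^2.
exact = h ** 2 * float(np.dot(A @ z, A @ z))
print(penalty, exact)
```

In training, a penalty of this kind would be added to the classification loss with some weight, so that the optimizer is pushed toward networks whose loss surface is flatter around the datapoints.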

AT vs CURE
AT                         CURE
Implicit regularization    Explicit regularization
Time consuming             3x to 5x faster
SOTA robustness            On par with SOTA
▪ Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.

Summary
Inherently more robust classifiers: Curvature regularization can significantly improve the robustness properties.
Counter-intuitive observation: Due to a more linear nature, an adversarially trained net is “easier” to fool.
A better trade-off?
▪ Adversarial Robustness through Local Linearization, Qin et al., arXiv.

Future challenges

Disentangling different factors
Architectures: Batch-norm, dropout, depth, width, etc.
Data: # of modes, convexity, distinguishability, etc.
Training: Batch size, solver, learning rate, etc.

Beyond additive perturbations
[Figure labels: Bear, Fox]
▪ Geometric robustness of deep networks, Kanbak, Moosavi, Frossard, CVPR 2018.
[Figure labels: “0”, “2”]
▪ Spatially transformed adversarial examples, Xiao et al., ICLR 2018.

“Interpretability” and robustness
[Figure panels: Original image; Standard training; Adversarial training]
▪ Robustness may be at odds with accuracy, Tsipras et al., NeurIPS 2018.

ETHZ, Zürich, Switzerland
Google Zürich

Interested in my research?
smoosavi.me
moosavi.sm@gmail.com
