Robustness and geometry of deep neural networks
Alhussein Fawzi, DeepMind
May 23rd, 2019
The Mathematics of Deep Learning and Data Science, University of Cambridge
Recent advances in machine learning
[Figure: ImageNet classification error rate (%) over time]
He et al., "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification", 2015
Karpathy et al., "Automated Image Captioning with ConvNets and Recurrent Nets"
LeCun et al., "Deep Learning", 2015
DeepMind, AlphaGo: https://deepmind.com/research/alphago/
Robustness of classifiers to perturbations
In real-world environments, images undergo perturbations: an image of a lampshade is perturbed and passed through the classifier f. Is it still classified as a lampshade?
Broad range of perturbations:
- Adversarial perturbations [Szegedy et al., ICLR 2014], [Biggio et al., PKDD 2013], ...
- Random noise [Fawzi et al., NIPS 2016], [Franceschi et al., AISTATS 2018]
- Structured nuisances: geometric transformations [Bruna et al., TPAMI 2013], [Jaderberg et al., NIPS 2015], occlusions [Sharif et al., CCS 2016], etc.
Robustness of classifiers to perturbations (Cont'd)
Why study robustness? Safety of machine learning systems, and a better understanding of the geometry of state-of-the-art classifiers.
[Figure: decision boundary separating Class 1 and Class 2]
Talk outline
1. Fooling classifiers is easy: vulnerability to different perturbations.
2. Improving the robustness (i.e., "defending") is difficult.
3. Geometric analysis of a successful defense: adversarial training.
Adversarial perturbations
State-of-the-art deep neural networks have been shown to be surprisingly unstable to adversarial perturbations: an image correctly classified as "school bus" becomes "ostrich" after adding an imperceptible perturbation. Figure from [Szegedy et al., ICLR 2014].
Adversarial examples are found by seeking the minimal perturbation (in the ℓ2 sense) that switches the label of the classifier.
Adversarial perturbations
Robustness to adversarial noise: the minimal perturbation r*(x) is defined as
$$r^*(x) = \min_{r} \|r\|_2 \quad \text{subject to} \quad f(x + r) \neq f(x).$$
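As an illustration, here is a minimal sketch of how such a perturbation can be searched for numerically, assuming a PyTorch classifier `model` and a batched input `x` in which `step_size` and `max_iters` are illustrative placeholders; this greedy gradient search only approximates the true minimizer r*(x).

```python
import torch
import torch.nn.functional as F

def find_perturbation(model, x, step_size=0.01, max_iters=100):
    """Greedy search for a small L2 perturbation r with f(x + r) != f(x)."""
    model.eval()
    with torch.no_grad():
        original_label = model(x).argmax(dim=1)
    r = torch.zeros_like(x, requires_grad=True)
    for _ in range(max_iters):
        logits = model(x + r)
        if logits.argmax(dim=1).item() != original_label.item():
            break  # the label has flipped: x + r is adversarial
        loss = F.cross_entropy(logits, original_label)
        grad, = torch.autograd.grad(loss, r)
        with torch.no_grad():
            # Step in the loss-increasing direction with fixed L2 step length.
            r += step_size * grad / (grad.norm() + 1e-12)
    return r.detach()
```

Normalizing the gradient keeps each step's ℓ2 length fixed, so the accumulated perturbation stays small when only a few steps are needed to cross the boundary.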
Other types of adversarial perturbations
Universal perturbations [Moosavi-Dezfooli et al., 2017]
[Figure: a single universal perturbation changes the predicted labels of many images, with labels such as Chihuahua, Joystick, Labrador, Flagpole, Balloon, Terrier]
Geometric transformations [Fawzi et al., 2015], [Moosavi-Dezfooli et al., 2018], [Xiao et al., 2018]
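For the universal case, a heavily simplified sketch of the iterative scheme of [Moosavi-Dezfooli et al., 2017] is given below; it reuses the hypothetical `find_perturbation` helper from the previous sketch and omits details of the real algorithm, such as the DeepFool inner solver and the fooling-rate stopping criterion.

```python
import torch

def universal_perturbation(model, images, xi=10.0, passes=5):
    """Accumulate a single perturbation v that fools the model on many images."""
    v = torch.zeros_like(images[0]).unsqueeze(0)
    for _ in range(passes):
        for img in images:
            x = img.unsqueeze(0)
            with torch.no_grad():
                pred_clean = model(x).argmax(dim=1)
                not_fooled = (model(x + v).argmax(dim=1) == pred_clean).item()
            if not_fooled:
                # v does not fool this image yet: extend it minimally.
                dv = find_perturbation(model, x + v)
                v = v + dv
                # Project v back onto the L2 ball of radius xi.
                norm = v.norm()
                if norm > xi:
                    v = v * (xi / norm)
    return v
```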
Finding adversarial perturbations is easy... http://robust.vision
... but designing defense mechanisms is hard!
Despite the huge number of proposed defenses, state-of-the-art classifiers are still vulnerable to small perturbations.
Adversarial training
Adversarial accuracy (CIFAR-10): [Madry et al., 2017]
Adversarial training leads to state-of-the-art robustness to adversarial perturbations. But what does it actually do?
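A compact sketch of what an adversarial training loop can look like, in the spirit of the PGD-based formulation of [Madry et al., 2017]; `model`, `optimizer`, `loader`, and the ℓ∞ budget `eps` are illustrative placeholders (images assumed in [0, 1]), not the exact CIFAR-10 setup.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Craft an L-infinity-bounded adversarial example with PGD."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # gradient ascent step
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project to L-inf ball
            x_adv = x_adv.clamp(0.0, 1.0)              # stay a valid image
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer):
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)  # train on worst-case inputs
        loss.backward()
        optimizer.step()
```

Training on `x_adv` instead of `x` corresponds to the min-max formulation: the inner PGD maximization finds a worst-case input, and the outer step minimizes the loss on it.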
Decision boundaries
[Figure: 2D cross-sections of the decision boundary in the plane spanned by the adversarial direction and a random direction, for normal training vs. adversarial training]
After adversarial training, the decision boundaries are flatter and more regular.
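Such cross-sections can be produced by evaluating the classifier on a 2D grid spanned by the adversarial direction and a random orthogonal direction around a test point. The sketch below, with placeholder `model` and an unbatched input `x`, shows one way to do this; it is not the exact procedure behind the figures.

```python
import torch
import matplotlib.pyplot as plt

def plot_boundary_section(model, x, adv_dir, extent=2.0, res=101):
    """Plot predicted labels on the plane spanned by adv_dir and a random direction."""
    adv_dir = adv_dir / adv_dir.norm()
    rand_dir = torch.randn_like(adv_dir)
    rand_dir -= (rand_dir * adv_dir).sum() * adv_dir  # orthogonalize
    rand_dir /= rand_dir.norm()
    ts = torch.linspace(-extent, extent, res)
    labels = torch.empty(res, res, dtype=torch.long)
    with torch.no_grad():
        for i, a in enumerate(ts):        # adversarial direction (rows)
            for j, b in enumerate(ts):    # random direction (columns)
                point = x + a * adv_dir + b * rand_dir
                labels[i, j] = model(point.unsqueeze(0)).argmax(dim=1)
    plt.imshow(labels, origin="lower", extent=[-extent, extent, -extent, extent])
    plt.xlabel("random direction")
    plt.ylabel("adversarial direction")
    plt.show()
```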
Effect of adversarial training on loss landscape
[Figure: logit of the true label as a function of the input, before vs. after adversarial fine-tuning]
Effect of adversarial training on loss landscape (Cont'd)
Quantitative analysis: curvature decrease with adversarial training
We compute the Hessian matrix of the loss ℓ at a test point x, with respect to the inputs:
$$H = \left[ \frac{\partial^2 \ell}{\partial x_i \, \partial x_j} \right]_{i,j}.$$
The eigenvalues of H are the curvatures of ℓ in the vicinity of x.
[Figure: eigenvalue profile (value vs. eigenvalue number, ~3000 eigenvalues) for the original and adversarially trained networks]
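For small inputs, the Hessian and its eigenvalue profile can be computed directly; the following sketch assumes a classifier `model`, a test point `x`, and its label `y`, all placeholders. In high dimensions one would rely on Hessian-vector products instead of the dense matrix.

```python
import torch
import torch.nn.functional as F

def input_hessian_eigenvalues(model, x, y):
    """Eigenvalues of the Hessian of the loss with respect to the input x."""
    x_flat = x.reshape(-1)

    def loss_fn(x_vec):
        logits = model(x_vec.reshape(x.shape).unsqueeze(0))
        return F.cross_entropy(logits, y.unsqueeze(0))

    H = torch.autograd.functional.hessian(loss_fn, x_flat)  # (d, d) matrix
    eigvals = torch.linalg.eigvalsh(H)  # H is symmetric: real eigenvalues
    return eigvals.sort(descending=True).values
```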
Relation between curvature and robustness
Using a locally quadratic approximation of the loss function, we derive upper and lower bounds on the minimal perturbation required to fool a classifier (i.e., to drive the loss past the misclassification threshold).
[Figure: robustness vs. curvature; both the upper and lower bounds on robustness decrease as curvature increases]
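To see why the bounds behave this way, consider the following illustrative one-dimensional special case of the quadratic model (a sketch of the intuition only, not the talk's exact derivation):

```latex
% Illustrative 1-D special case, not the exact bounds of the talk.
% Along a unit direction u, write g = \nabla\ell(x)^\top u (assume g > 0),
% curvature \nu = u^\top H u > 0, and threshold gap c = t - \ell(x) > 0.
\[
  \ell(x + \rho u) \;\approx\; \ell(x) + g\,\rho + \tfrac{1}{2}\,\nu\,\rho^2 ,
\]
% The loss reaches the threshold t at the smallest positive root:
\[
  \rho(\nu) \;=\; \frac{-g + \sqrt{g^2 + 2\,\nu\,c}}{\nu} ,
  \qquad \lim_{\nu \to 0^+} \rho(\nu) = \frac{c}{g} ,
\]
% and \rho(\nu) is decreasing in \nu: the higher the curvature, the smaller
% the perturbation needed to fool the classifier, matching the figure.
```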
How important is the curvature decrease?
Is the curvature decrease the main effect of adversarial training leading to improved robustness? → We regularize explicitly for the curvature.
Idea: regularize the norm of the Hessian of the loss with respect to the inputs, using Hutchinson's estimator:
$$\|H\|_F^2 = \mathbb{E}_{z \sim \mathcal{N}(0, I)} \left[ \|Hz\|_2^2 \right].$$
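A sketch of how this estimator can be implemented with double backpropagation, so that Hz is obtained without ever forming H; `model`, `x`, `y`, and `num_samples` are placeholders, and the regularizer would add this quantity (suitably scaled) to the training loss.

```python
import torch
import torch.nn.functional as F

def hessian_frobenius_sq(model, x, y, num_samples=10):
    """Hutchinson estimate of ||H||_F^2, H = input Hessian of the loss."""
    x = x.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x, create_graph=True)
    estimate = x.new_zeros(())
    for _ in range(num_samples):
        z = torch.randn_like(x)  # z ~ N(0, I)
        # Hessian-vector product via double backprop: Hz = d(grad . z)/dx.
        hz, = torch.autograd.grad((grad * z).sum(), x, create_graph=True)
        estimate = estimate + (hz ** 2).sum()
    return estimate / num_samples  # unbiased estimate of ||H||_F^2
```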