Robust Attribution Regularization
Jiefeng Chen*1, Xi Wu*2, Vaibhav Rastogi†2, Somesh Jha1,3, Yingyu Liang1
1 University of Wisconsin-Madison  2 Google  3 XaiPient
NeurIPS 2019
* Equal contribution  † Work done while at UW-Madison
Machine Learning Progress
• Significant progress in machine learning:
  • Machine translation
  • Computer vision
  • Game playing
  • Medical imaging
Key Engine Behind the Success
• Training deep neural networks: y = f(x; θ)
• Given training data {(x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)}
• Find parameters θ such that the network fits the data
[Figure: a network labeling training images as Outdoor/Indoor]
Key Engine Behind the Success
• Using deep neural networks: y = f(x; θ)
• Given a new test point x
• Predict y = f(x; θ)
[Figure: the trained network predicting "Outdoor" for a new image]
Challenges
• Black box: little understanding or interpretation of how predictions are made
• Vulnerable to adversarial perturbations
[Figure: a black-box model predicting "Windflower"]
Interpretable Machine Learning
• Attribution task: given a model and an input, compute an attribution map measuring the importance of each input dimension
[Figure: input image ("Windflower") → machine learning model → attribution map]
Integrated Gradient: Axiomatic Approach Overview
• List desirable criteria (axioms) for an attribution method
• Establish a uniqueness result: only this method satisfies these desirable criteria
• Inspired by the economics literature: Values of Non-Atomic Games. Aumann and Shapley, 1974.
Axiomatic Attribution for Deep Networks. Mukund Sundararajan, Ankur Taly, Qiqi Yan. ICML 2017.
Integrated Gradient: Definition
• Given a function F, an input x, and a baseline x′, the attribution to dimension i is
  IG_i(x, x′) = (x_i − x′_i) · ∫₀¹ ∂F(x′ + α(x − x′)) / ∂x_i dα
• In practice, the integral is approximated by a Riemann sum over the interpolation path
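As a sanity check, the definition can be implemented in a few lines. The sketch below uses an assumed toy quadratic model (not one of the paper's networks), approximates the path integral with a midpoint Riemann sum, and numerically verifies the Completeness axiom:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=300):
    """IG_i = (x_i - x'_i) * integral over alpha in [0,1] of
    dF/dx_i at x' + alpha*(x - x'), via a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints in (0, 1)
    diff = x - baseline
    grads = np.array([grad_f(baseline + a * diff) for a in alphas])
    return diff * grads.mean(axis=0)

# Toy model F(x) = (w . x)^2 with hand-derived gradient 2(w . x)w.
w = np.array([1.0, -2.0, 0.5])
f = lambda x: float(np.dot(w, x) ** 2)
grad_f = lambda x: 2.0 * np.dot(w, x) * w

x = np.array([0.3, 0.1, -0.4])
baseline = np.zeros(3)
attr = integrated_gradients(grad_f, x, baseline)   # ~ [-0.03, 0.02, 0.02]

# Completeness: attributions should sum to F(x) - F(baseline).
completeness_gap = abs(attr.sum() - (f(x) - f(baseline)))
```

For this quadratic the path gradient is linear in α, so the midpoint rule is exact and the completeness gap is at floating-point level.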
Integrated Gradient: Example Results
Integrated Gradient: Axioms
• Implementation Invariance: two networks that compute identical functions for all inputs get identical attributions, even if their architectures/parameters differ
• Sensitivity:
  (a) If the baseline and the input have different scores but differ in a single variable, then that variable gets some attribution
  (b) If a variable has no influence on the function, then it gets no attribution
• Linearity Preservation: Attr(a·f₁ + b·f₂) = a·Attr(f₁) + b·Attr(f₂)
• Completeness: sum(Attr) = f(input) − f(baseline)
• Symmetry Preservation: symmetric variables with identical values get equal attributions
Attribution is Fragile
• A very small adversarial perturbation can leave the model's prediction ("windflower") unchanged while producing a very different attribution map
Interpretation of Neural Networks is Fragile. Amirata Ghorbani, Abubakar Abid, James Zou. AAAI 2019.
Robust Prediction Correlates with Robust Attribution: Why? • Training for robust prediction: find a model that predicts the same label for all perturbed images around the training image original image, normally trained model perturbed image, normally trained model
Robust Prediction Correlates with Robust Attribution: Why? • Training for robust prediction: find a model that predicts the same label for all perturbed images around the training image original image, robustly trained model perturbed image, robustly trained model
Robust Attribution Regularization
• Training for robust attribution: find a model that produces similar attributions for all perturbed images around the training image
  min_θ E[ ℓ(x, y; θ) + λ · RAR ],  where RAR = max_{x′∈N(x,ε)} s(IG(x, x′))
  (x′: perturbed input; N(x, ε): set of allowed perturbations)
Robust Attribution Regularization
• Training for robust attribution: find a model that produces similar attributions for all perturbed images around the training image
  min_θ E[ ℓ(x, y; θ) + λ · RAR ],  where RAR = max_{x′∈N(x,ε)} s(IG(x, x′))
  (s: size function; IG: Integrated Gradients)
Robust Attribution Regularization
• Training for robust attribution: find a model that produces similar attributions for all perturbed images around the training image
  min_θ E[ ℓ(x, y; θ) + λ · RAR ],  where RAR = max_{x′∈N(x,ε)} s(IG(x, x′))
• Two instantiations:
  IG-NORM:     RAR = max_{x′∈N(x,ε)} ‖IG(x, x′)‖₁
  IG-SUM-NORM: RAR = max_{x′∈N(x,ε)} [ ‖IG(x, x′)‖₁ + sum(IG(x, x′)) ]
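A minimal sketch of the IG-NORM objective on a toy, hand-differentiated loss. The inner maximization is approximated here by random sampling of the ℓ∞ ball rather than the PGD-style inner loop used to train real networks; the loss, constants, and sample count are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loss l(x) = (w . x - t)^2 and its hand-derived input gradient.
w, t = np.array([1.0, -2.0, 0.5]), 0.25
loss = lambda x: float((np.dot(w, x) - t) ** 2)
grad = lambda x: 2.0 * (np.dot(w, x) - t) * w

def ig(x, x_prime, steps=100):
    """Riemann-sum Integrated Gradients of the loss, from x to x_prime."""
    alphas = (np.arange(steps) + 0.5) / steps
    diff = x_prime - x
    g = np.mean([grad(x + a * diff) for a in alphas], axis=0)
    return diff * g

def rar_ig_norm(x, eps=0.1, n_samples=64):
    """Approximate RAR = max over the l-inf ball B(x, eps) of ||IG(x, x')||_1
    by random sampling (stand-in for the paper's PGD inner maximization)."""
    best = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=x.shape)
        best = max(best, float(np.abs(ig(x, x + delta)).sum()))
    return best

x = np.array([0.3, 0.1, -0.4])
reg = rar_ig_norm(x)
total = loss(x) + 1.0 * reg   # per-example IG-NORM objective with lambda = 1
```

In actual training, `total` would be averaged over a minibatch and minimized over θ by SGD, re-solving the inner maximization at every step.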
Experiments: Qualitative Flower dataset
Experiments: Qualitative MNIST dataset
Experiments: Qualitative Fashion-MNIST dataset
Experiments: Qualitative GTSRB dataset
Experiments: Quantitative
• Metrics for attribution robustness:
  1. Kendall's tau rank order correlation
  2. Top-K intersection
[Figure: original vs. perturbed image with attribution maps; Top-1000 intersection: 0.1%, Kendall's correlation: 0.2607]
Result on Flower dataset
Result on MNIST dataset
Result on Fashion-MNIST dataset
Result on GTSRB dataset
Prediction Accuracy of Different Models

Dataset        Approach      Accuracy
MNIST          NATURAL       99.17%
               IG-NORM       98.74%
               IG-SUM-NORM   98.34%
Fashion-MNIST  NATURAL       90.86%
               IG-NORM       85.13%
               IG-SUM-NORM   85.44%
GTSRB          NATURAL       98.57%
               IG-NORM       97.02%
               IG-SUM-NORM   95.68%
Flower         NATURAL       86.76%
               IG-NORM       85.29%
               IG-SUM-NORM   82.35%
Connection to Robust Prediction
• RAR: min_θ E[ ℓ(x, y; θ) + λ · RAR ],  RAR = max_{x′∈N(x,ε)} s(IG(x, x′))
• If λ = 1 and s(·) = sum(·), then RAR becomes the Adversarial Training objective for robust prediction, simply by the Completeness of IG:
  min_θ E[ max_{x′∈N(x,ε)} ℓ(x′, y; θ) ]
Towards Deep Learning Models Resistant to Adversarial Attacks. Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu. ICLR 2018.
When Do the Two Coincide?
• Theorem: for the special case of one-layer neural networks (linear functions), the robust attribution instantiation (s(·) = ‖·‖₁) and the robust prediction instantiation (s(·) = sum(·)) coincide, and both reduce to soft max-margin training.
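A sketch of the intuition for a linear score function f(x) = ⟨w, x⟩ (the theorem itself is stated for the loss of a one-layer network): IG has a closed form, and over the ℓ∞ ball both size functions attain the same worst-case value.

```latex
% Intuition sketch for a linear score f(x) = \langle w, x \rangle.
\begin{align*}
\mathrm{IG}_i(x, x') &= (x_i - x'_i)\, w_i
  && \text{(path integral of a constant gradient)} \\
\operatorname{sum}\bigl(\mathrm{IG}(x, x')\bigr)
  &= \langle w,\, x - x' \rangle = f(x) - f(x') \\
\max_{\|x'-x\|_\infty \le \epsilon} \bigl\|\mathrm{IG}(x, x')\bigr\|_1
  &= \max_{\|\delta\|_\infty \le \epsilon} \sum_i |\delta_i|\,|w_i|
   = \epsilon \|w\|_1 \\
\max_{\|x'-x\|_\infty \le \epsilon} \langle w,\, x - x' \rangle
  &= \epsilon \|w\|_1
\end{align*}
```

Both worst-case regularizers equal ε‖w‖₁, so the two instantiations agree; folding this worst-case margin shift into a logistic loss yields soft max-margin training.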
Connection to Robust Prediction
• RAR: min_θ E[ ℓ(x, y; θ) + λ · RAR ],  RAR = max_{x′∈N(x,ε)} s(IG(x, x′))
• If λ = λ′/ε² and s(·) = ‖·‖₂² with approximate IG, then RAR becomes Input Gradient Regularization for robust prediction:
  min_θ E[ ℓ(x, y; θ) + λ′ ‖∇_x ℓ(x, y; θ)‖₂² ]
Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing Their Input Gradients. Andrew Slavin Ross and Finale Doshi-Velez. AAAI 2018.
Discussion •Robust attribution leads to more human-aligned attribution. •Robust attribution may help tackle spurious correlations.
THANK YOU!