Bias Also Matters: Bias Attribution for Deep Neural Network Explanation
Shengjie Wang*, Tianyi Zhou*, Jeff A. Bilmes, University of Washington, Seattle
Explain DNNs as a linear model per data point
• A DNN with piecewise-linear activations such as ReLU, when applied to a data point x, equals a linear model f(x) = wx + b.
• The gradient term, i.e., w in f(x), has been widely studied to explain the DNN output on a given data point.
• The bias b, however, is usually overlooked.
Bias contains important information about DNNs
• Decomposition of a DNN for every data point x:
  f(x) = W_m σ_{m-1}(W_{m-1} σ_{m-2}(... σ_1(W_1 x + b_1) ...) + b_{m-1}) + b_m,
  where W_ℓ and b_ℓ are the weight matrix and bias term of layer ℓ, and σ_ℓ is the corresponding activation.
• The bias term b of the per-data-point linear model, though a single scalar per output, results from a complicated process involving both the weights and biases of all DNN layers.
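The per-data-point linear decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the poster's code: the layer sizes and random weights below are made up, and the key fact shown is that at a fixed input x each ReLU acts as a 0/1 mask, so the whole network collapses to one affine map W_x x + b_x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-layer ReLU MLP (sizes and weights are illustrative only).
shapes = [(5, 4), (6, 5), (3, 6)]
Ws = [rng.standard_normal(s) for s in shapes]
bs = [rng.standard_normal(s[0]) for s in shapes]

def forward(x):
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        h = np.maximum(W @ h + b, 0.0)  # ReLU hidden layers
    return Ws[-1] @ h + bs[-1]          # linear output layer

def local_linear(x):
    """Return (W_x, b_x) such that forward(x) == W_x @ x + b_x."""
    W_x = np.eye(len(x))
    b_x = np.zeros(len(x))
    h = x
    for i, (W, b) in enumerate(zip(Ws, bs)):
        pre = W @ h + b
        # Activation pattern at x; the final layer has no ReLU.
        mask = np.ones_like(pre) if i == len(Ws) - 1 else (pre > 0).astype(float)
        h = mask * pre
        W_x = mask[:, None] * (W @ W_x)      # fold this layer into the weight term
        b_x = mask * (W @ b_x + b)           # fold this layer into the bias term
    return W_x, b_x

x = rng.standard_normal(4)
W_x, b_x = local_linear(x)
print(np.allclose(forward(x), W_x @ x + b_x))  # True
```

Note that b_x is exact only at this particular x: a different input can flip the activation pattern and hence change both W_x and b_x.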
Bias is important for DNN performance
• A linear model with the gradient term only may produce wrong predictions; the bias term corrects them.

Dataset  | Train Without Bias | Train With Bias, Test All | Test Only wx | Test Only b
CIFAR10  | 87.0               | 90.9                      | 71.5         | 62.2
CIFAR100 | 62.8               | 66.8                      | 40.3         | 36.5
FMNIST   | 94.1               | 94.7                      | 76.1         | 24.6

Our method, "Bias Backpropagation (BBp)", explicitly attributes the bias term to each input feature.
Bias Backpropagation (BBp)
• Start from the final layer and attribute the bias in a backpropagation style.
• For every layer:
  - Receive the bias attribution from the layer above.
  - Combine the received attribution with the effective bias of this layer.
  - Attribute the combined term to the input of this layer.
• The sum of the attributions on all input features exactly recovers f(x).

Algorithm 1: Bias Backpropagation (BBp)
Input: x, {W_ℓ}_{ℓ=1}^m, {b_ℓ}_{ℓ=1}^m, {σ_ℓ(·)}_{ℓ=1}^m
1: Compute {W^x_ℓ}_{ℓ=1}^m and {b^x_ℓ}_{ℓ=1}^m for x by Eq. (5)   // data-point-specific weight/bias
2: β_m ← b^x_m                                                     // β_ℓ holds the accumulated attribution
3: for ℓ ← m down to 2 do
4:   for p ← 1 to d_ℓ do
5:     Compute α_ℓ[p] by Eq. (15)-(17) or Eq. (18)                 // compute attribution scores
6:     B_ℓ[p, q] ← α_ℓ[p, q] × β_ℓ[p], ∀q ∈ [d_{ℓ-1}]             // attribute to the layer input
7:   for q ← 1 to d_{ℓ-1} do
8:     β_{ℓ-1}[q] ← b^x_{ℓ-1}[q] + Σ_{p=1}^{d_ℓ} B_ℓ[p, q]        // combine with bias of layer ℓ-1
9: return β_1 ∈ R^{d_in}
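The backward pass above can be sketched in NumPy under two loudly labeled assumptions: the network is a made-up toy, and the proportional splitting rule marked below is a simple stand-in for the poster's Eq. (15)-(18), which are not reproduced here. The sketch keeps the one property that is checkable: the input attributions sum exactly to the effective bias of the chosen logit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ReLU MLP (illustrative weights; the last layer is linear).
shapes = [(5, 4), (6, 5), (3, 6)]
Ws = [rng.standard_normal(s) for s in shapes]
bs = [rng.standard_normal(s[0]) for s in shapes]

def bbp(x, c):
    """Attribute the effective bias of logit c back to the input features of x."""
    # Forward pass: cache the data-point-specific (masked) weights and biases.
    Wx, bx, hs, h = [], [], [x], x
    for i, (W, b) in enumerate(zip(Ws, bs)):
        pre = W @ h + b
        mask = np.ones_like(pre) if i == len(Ws) - 1 else (pre > 0).astype(float)
        Wx.append(mask[:, None] * W)
        bx.append(mask * b)
        h = mask * pre
        hs.append(h)
    # Backward pass: start from the final-layer bias and push it down.
    v = np.zeros(len(h)); v[c] = 1.0   # downstream linear coefficients for logit c
    beta = v * bx[-1]                  # accumulated attribution at the top layer
    for l in range(len(Ws) - 1, -1, -1):
        # Stand-in splitting rule (NOT the poster's Eq. (15)-(18)): split beta[p]
        # over the layer's inputs q in proportion to |contribution of q to p|.
        contrib = np.abs(Wx[l] * hs[l][None, :]) + 1e-12
        alpha = contrib / contrib.sum(axis=1, keepdims=True)
        phi = alpha.T @ beta           # attribution onto the inputs of layer l
        if l == 0:
            return phi                 # attribution over the input features
        v = Wx[l].T @ v
        beta = v * bx[l - 1] + phi     # fold in the effective bias of layer l-1

def forward_and_Wx(x):
    """Output and per-example weight term, for checking conservation."""
    W_x, h = np.eye(len(x)), x
    for i, (W, b) in enumerate(zip(Ws, bs)):
        pre = W @ h + b
        mask = np.ones_like(pre) if i == len(Ws) - 1 else (pre > 0).astype(float)
        h = mask * pre
        W_x = mask[:, None] * (W @ W_x)
    return h, W_x

x = rng.standard_normal(4)
c = 1
phi = bbp(x, c)
out, W_x = forward_and_Wx(x)
# Conservation: attributions sum to f(x)[c] minus the gradient term (W_x @ x)[c].
print(np.isclose(phi.sum(), out[c] - W_x[c] @ x))  # True
```

Any splitting rule whose rows of alpha sum to 1 preserves this conservation property; the specific choice of rule is what distinguishes the three BBp variants.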
Examples of Attribution Results on Images
[Figure: attribution maps (raw and normalized) from integrated gradients, the three BBp variants (bias.1, bias.2, bias.3), and the gradient, for images labeled Teddy Bear, Brambling, Longhorn Beetle, Fireguard, Folding Chair, Fountain Pen, and Piggy Bank.]
Bias Attribution of various layers
[Figure: bias.1, bias.2, and bias.3 attribution maps computed from all layers vs. all layers except the first 2, 4, or 6, alongside the original images.]
• We can use BBp to analyze biases of different layers.
• Bias from lower layers results in more noise in the attribution.
• Bias from deeper layers reveals high-level features (e.g., head parts of the dog and the bird).
• "bias.1(2,3)" correspond to the three variants of BBp.
Quantitative evaluation on the MNIST digit flip test
• Mask input image pixels based on their attribution scores.
• Check how the predictions change.
• Measure the log-odds of the target vs. the source class before and after masking pixels.
• BBp is class-sensitive and comparable to methods such as integrated gradients and DeepLift.
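The flip test above can be sketched with a toy linear classifier. Everything below is a stand-in, not the poster's actual setup: the model is random, and the class-sensitive attribution rule is a simple gradient-times-input surrogate rather than BBp itself. For a linear model, the gain in target-vs-source log-odds after masking equals exactly the masked attribution mass.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_odds(logits, target, source):
    """log(p_target / p_source) under softmax; equals logits[target] - logits[source]."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return np.log(p[target] / p[source])

# Toy 10-class linear classifier on flattened 8x8 "images" (random weights).
W, b = rng.standard_normal((10, 64)), rng.standard_normal(10)
x = rng.random(64)
source = int(np.argmax(W @ x + b))   # predicted (source) class
target = (source + 1) % 10           # class we try to flip the prediction to

# Class-sensitive attribution toward source relative to target
# (a stand-in for BBp / integrated-gradient scores).
attrib = (W[source] - W[target]) * x

before = log_odds(W @ x + b, target, source)

# Mask the k pixels most responsible for "source rather than target".
k = 16
idx = np.argsort(attrib)[-k:]
masked = x.copy()
masked[idx] = 0.0
after = log_odds(W @ masked + b, target, source)

# For a linear model the log-odds gain equals the masked attribution mass.
print(np.isclose(after - before, attrib[idx].sum()))  # True
```

A class-sensitive attribution method should therefore raise the target-vs-source log-odds more than a class-agnostic one when the same number of pixels is masked.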
Thank you!
• For more details, please come to our poster session: Wednesday 06:30-09:00 PM, Pacific Ballroom #147.