Interpretability in Machine Learning
Why Interpret?
The current state of machine learning
And its uses ... https://www.tesla.com/videos/autopilot-self-driving-hardware-neighborhood-long (image credits: NYPost, MIT Technology Review, DeepMind)
So are we in the golden age of AI?
Safety and well-being
Bias in algorithms
https://medium.com/@Joy.Buolamwini/response-racial-and-gender-bias-in-amazon-rekognition-commercial-ai-system-for-analyzing-faces-a289222eeced
https://www.infoq.com/presentations/unconscious-bias-machine-learning/
Adversarial Examples
Legal Issues - GDPR
And more ...
● Interactive feedback – Can a model learn from human actions in an online setting? (Can you tell a model not to repeat a specific mistake?)
● Recourse – Can a model tell us what actions we can take to change its output? (For example, what can you do to improve your credit score?)
In general, it seems like there are a few fundamental problems:
● We don’t trust the models
● We don’t know what happens in extreme cases
● Mistakes can be expensive / harmful
● Does the model make similar mistakes as humans?
● How do we change the model when things go wrong?
Interpretability is one way we try to deal with these problems.
What is interpretability?
There is no standard definition – most agree it is something different from performance.
● Ability to explain or to present a model in understandable terms to humans (Doshi-Velez 2017)
● Cynical view – it is what makes you feel good about the model.
● It really depends on the target audience.
What does interpretation look like?
● In pre-deep-learning models, some models are considered “interpretable”
What does interpretation look like?
● Heatmap visualization [Jain 2018] [Sundararajan 2017]
What does interpretation look like?
● Give prototypical examples [Kim 2016]
(Image: By Chire – Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=11765684)
What does interpretation look like?
● Bake it into the model [Bastings et al 2019]
What does interpretation look like?
● Provide an explanation as text [Hancock et al 2018] [Rajani et al 2019]
Some properties of interpretations
● Faithfulness – Does the explanation accurately represent the true reasoning behind the model’s final decision?
● Plausibility – Is the explanation correct, or something we can believe is true, given our current knowledge of the problem?
● Understandable – Can I put it in terms that an end user without in-depth knowledge of the system can understand?
● Stability – Do similar instances have similar interpretations?
Evaluating Interpretability [Doshi-Velez 2017]
● Application-level evaluation – Put the model in practice and have the end users interact with explanations to see if they are useful.
● Human evaluation – Set up a Mechanical Turk task and ask non-experts to judge the explanations.
● Functional evaluation – Design metrics that directly test properties of your explanation (a sketch of such metrics follows below).
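As a concrete illustration of a functional evaluation, here is a minimal sketch of two frequently used rationale metrics, comprehensiveness and sufficiency, in the spirit of the ERASER benchmark (the slide does not name a specific metric). The `predict_proba` interface, the binary rationale mask, and the toy lexicon model are illustrative assumptions.

```python
# Functional evaluation sketch: comprehensiveness / sufficiency for a
# rationale-style explanation. The model interface and mask format are
# illustrative assumptions, not from the slides.
import numpy as np

def comprehensiveness(predict_proba, tokens, rationale_mask, label):
    """Drop the rationale tokens; a faithful rationale should hurt the prediction."""
    full = predict_proba(tokens)[label]
    kept = [t for t, m in zip(tokens, rationale_mask) if not m]
    return full - predict_proba(kept)[label]   # large drop => rationale mattered

def sufficiency(predict_proba, tokens, rationale_mask, label):
    """Keep only the rationale tokens; a good rationale should preserve the prediction."""
    full = predict_proba(tokens)[label]
    rationale_only = [t for t, m in zip(tokens, rationale_mask) if m]
    return full - predict_proba(rationale_only)[label]  # small gap => rationale suffices

# Toy usage with a dummy "model" that just counts positive words.
POSITIVE = {"great", "good", "fun"}
def toy_predict_proba(tokens):
    p = min(1.0, 0.2 + 0.3 * sum(t in POSITIVE for t in tokens))
    return np.array([1 - p, p])                # [negative, positive]

tokens = "the movie was great fun".split()
mask = [0, 0, 0, 1, 1]                         # hypothetical rationale: "great fun"
print(comprehensiveness(toy_predict_proba, tokens, mask, label=1))
print(sufficiency(toy_predict_proba, tokens, mask, label=1))
```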
How to “interpret”? Some definitions
Global vs Local
● Global – Do we explain the entire model? Examples: Prototypes, Linear Regression, Decision Trees.
● Local – Do we explain an individual prediction? Examples: Heatmaps, Rationales.
Inherent vs Post-hoc
● Inherent – Is the explainability built into the model? Examples: Rationales, Linear Regression, Decision Trees.
● Post-hoc – Is the model a black box that we probe with an external method to try to understand it? Examples: Heatmaps (some forms), Prototypes, Natural Language Explanations.
Model-based vs Model-agnostic
● Model-based – Can it explain only a few classes of models? Examples: Rationales, LR / Decision Trees, Attention, Gradients (differentiable models only).
● Model-agnostic – Can it explain any model? Examples: LIME (Locally Interpretable Model-Agnostic Explanations), SHAP (Shapley values).
Some Locally Interpretable, Post-hoc methods
Saliency-Based Methods
● Heatmap-based visualization
● Need a differentiable model in most cases
● Normally involve gradients
(Diagram: an input goes through the model, which predicts “dog”; an explanation method then produces a heatmap over the input.)
[Adebayo et al 2018]
Saliency Example – Gradients
$f(x): \mathbb{R}^d \to \mathbb{R}, \qquad E(f)(x) = \dfrac{\partial f(x)}{\partial x}$
How do we take the gradient with respect to words? Take the gradient with respect to the embedding of the word.
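A minimal sketch of gradient saliency for text, following the recipe above: embed the tokens, take the gradient of the class score with respect to each token's embedding, and report one scalar per token. The tiny tanh bag-of-embeddings model and the per-token L2-norm aggregation are illustrative choices, not from the slides.

```python
# Gradient saliency for text: gradient of the class score w.r.t. each word embedding.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
emb = nn.Embedding(len(vocab), 8)
clf = nn.Linear(8, 2)                       # two classes: negative / positive

def gradient_saliency(tokens, target_class):
    ids = torch.tensor([vocab[t] for t in tokens])
    e = emb(ids)                            # (seq_len, emb_dim)
    e.retain_grad()                         # keep the gradient on this non-leaf tensor
    score = clf(torch.tanh(e).mean(dim=0))[target_class]
    score.backward()
    return e.grad.norm(dim=1)               # one importance score per token

tokens = ["the", "movie", "was", "great"]
for tok, s in zip(tokens, gradient_saliency(tokens, target_class=1)):
    print(f"{tok:>6}: {s.item():.4f}")
```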
Saliency Example – Leave-one-out
$f(x): \mathbb{R}^d \to \mathbb{R}, \qquad E(f)(x)_i = f(x) - f(x_{\setminus i})$
How do we remove an input feature?
1. Zero out pixels in an image
2. Remove the word from the text
3. Replace the value with the population mean in tabular data
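A minimal sketch of leave-one-out saliency for text, matching the formula above: score the full input, then re-score it with each token removed. The toy black-box scoring function is a stand-in for any model's class probability.

```python
# Leave-one-out saliency: E(f)(x)_i = f(x) - f(x without token i).
def leave_one_out(score_fn, tokens):
    full = score_fn(tokens)
    return [full - score_fn(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]

# Toy black box: probability of "positive" rises with positive words.
POSITIVE = {"great", "fun"}
def toy_score(tokens):
    return min(1.0, 0.1 + 0.4 * sum(t in POSITIVE for t in tokens))

tokens = "the movie was great fun".split()
for tok, s in zip(tokens, leave_one_out(toy_score, tokens)):
    print(f"{tok:>6}: {s:+.2f}")
```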
Problems with Saliency Maps
● Only capture first-order information
● Strange things can happen to heatmaps once higher-order effects matter [Feng et al 2018] (a toy illustration follows below)
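A toy illustration of the first-order limitation (an added example, not from the slides): gradients at a single point can miss features that only matter through interactions.

```latex
% Let f(x_1, x_2) = x_1 x_2 and evaluate the gradient at (x_1, x_2) = (0, 5):
\[
\left.\frac{\partial f}{\partial x_1}\right|_{(0,5)} = x_2 = 5,
\qquad
\left.\frac{\partial f}{\partial x_2}\right|_{(0,5)} = x_1 = 0 .
\]
% A gradient heatmap marks x_2 as irrelevant, yet f depends on x_2 entirely through
% the second-order (interaction) term: once x_1 moves away from 0, x_2 matters a lot.
```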
(Slide Credit – Julius Adebayo)
Main Idea behind LIME – Locally Interpretable Model-Agnostic Explanations
Feed inputs $x_1, x_2, \cdots, x_N$ to the black box (e.g. a neural network) to get outputs $y_1, y_2, \cdots, y_N$, then fit a linear model on the same inputs so that its outputs are as close as possible to the black box’s. We can’t do this globally, of course, but can we do it locally?
(Image Credit – Hung-yi Lee)
Intuition behind LIME [Ribeiro et al 2016]
LIME – Image
1. Given a data point you want to explain.
2. Sample nearby points – each image is represented as a set of superpixels (segments); randomly delete some segments and compute the probability of “frog” for each perturbed image with the black box (e.g. 0.52, 0.85, 0.01).
Ref: https://medium.com/@kstseng/lime-local-interpretable-model-agnostic-explanation%E6%8A%80%E8%A1%93%E4%BB%8B%E7%B4%B9-a67b6c34c3f8
(Slide Credit – Hung-yi Lee)
LIME – Image
3. Fit a linear (or interpretable) model. Represent each perturbed image with one binary feature per segment,
$x_m = \begin{cases} 0 & \text{segment } m \text{ is deleted} \\ 1 & \text{segment } m \text{ exists} \end{cases}$
where $M$ is the number of segments, and fit the linear model to the black-box outputs (0.52, 0.85, 0.01).
(Slide Credit – Hung-yi Lee)
LIME – Image
4. Interpret the model you learned:
$y = w_1 x_1 + \cdots + w_m x_m + \cdots + w_M x_M$
● If $w_m \approx 0$: segment $m$ is not related to “frog”.
● If $w_m$ is positive: segment $m$ indicates the image is a “frog”.
● If $w_m$ is negative: segment $m$ indicates the image is not a “frog”.
(Slide Credit – Hung-yi Lee)
The Math behind LIME [Ribeiro et al 2016]
$\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$
The first term makes the interpretable model $g$ match the black-box model $f$ in the neighborhood $\pi_x$ of the instance; the second term controls the complexity of the interpretable model.
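Below is a minimal LIME-style sketch for text (not the official `lime` package): perturb the input by dropping words, query the black box, weight samples by proximity, and fit a weighted ridge surrogate whose coefficients are the explanation. The perturbation scheme, the exponential kernel on the fraction of words removed, and the toy black box are illustrative assumptions.

```python
# LIME-style explanation sketch for a text black box.
import numpy as np
from sklearn.linear_model import Ridge

def lime_text(score_fn, tokens, n_samples=500, kernel_width=0.25, seed=0):
    rng = np.random.default_rng(seed)
    d = len(tokens)
    # 1. Interpretable representation: binary vector, 1 = word kept.
    Z = rng.integers(0, 2, size=(n_samples, d))
    Z[0] = 1                                   # include the original instance
    # 2. Query the black box on each perturbed input.
    y = np.array([score_fn([t for t, keep in zip(tokens, z) if keep]) for z in Z])
    # 3. Weight samples by proximity to the original instance.
    dist = 1.0 - Z.mean(axis=1)                # fraction of words removed
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear surrogate; its coefficients are the explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return dict(zip(tokens, surrogate.coef_))

# Toy black box to explain.
POSITIVE = {"great", "fun"}
def toy_score(tokens):
    return min(1.0, 0.1 + 0.4 * sum(t in POSITIVE for t in tokens))

print(lime_text(toy_score, "the movie was great fun".split()))
```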
Example from NLP
Rationalization Models
General Idea
(Diagram: an extractor selects a rationale – a subset of the input – and a classifier predicts from the rationale alone, e.g. “Tree frog (97%)” for an image and “Positive (98%)” for a review; a minimal sketch follows below.)
(Slides Credit – Tao Lei)
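A forward-pass sketch of the extractor-classifier idea from the diagram above, in the spirit of Lei et al. 2016 (whose slides are credited here): the extractor samples a binary mask over tokens and the classifier only sees the masked tokens. Training details (REINFORCE or Gumbel relaxations, sparsity and continuity penalties) are omitted; the module sizes and Bernoulli sampling are illustrative.

```python
# Rationalization model sketch: extractor picks tokens, classifier sees only those.
import torch
import torch.nn as nn

class RationaleModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=16, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.extractor = nn.Linear(emb_dim, 1)       # per-token keep/drop score
        self.classifier = nn.Linear(emb_dim, n_classes)

    def forward(self, ids):
        e = self.emb(ids)                            # (seq_len, emb_dim)
        keep_prob = torch.sigmoid(self.extractor(e)).squeeze(-1)
        mask = torch.bernoulli(keep_prob)            # hard 0/1 rationale
        masked = e * mask.unsqueeze(-1)              # classifier sees rationale only
        logits = self.classifier(masked.sum(0) / mask.sum().clamp(min=1))
        return logits, mask

torch.manual_seed(0)
model = RationaleModel(vocab_size=10)
logits, mask = model(torch.tensor([1, 4, 7, 2]))
print(logits, mask)   # prediction and which tokens were kept as the rationale
```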
FRESH Model – Faithful Rationale Extraction using Saliency Thresholding
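A high-level sketch of the pipeline the FRESH name describes: get per-token saliency scores from a support model, threshold them into a hard rationale, and train a separate classifier that only ever sees the rationale, so its prediction is faithful to the rationale by construction. The saliency source, the top-k thresholding, and the toy helpers below are illustrative stand-ins, not the paper's implementation.

```python
# FRESH-style pipeline sketch: saliency -> thresholded rationale -> classifier.
import numpy as np

def extract_rationale(tokens, scores, keep_fraction=0.4):
    """Keep the top-k% most salient tokens (by score) as the rationale."""
    k = max(1, int(round(keep_fraction * len(tokens))))
    top = set(np.argsort(scores)[-k:])
    return [t for i, t in enumerate(tokens) if i in top]

# Toy stand-ins for the two trained models in the real pipeline.
POSITIVE = {"great", "fun"}
def toy_saliency(tokens):                      # stand-in for attention/gradient scores
    return np.array([1.0 if t in POSITIVE else 0.1 for t in tokens])

def toy_train_classifier(rationale_dataset):   # stand-in for training a real classifier
    pos_words = {t for toks, label in rationale_dataset if label == 1 for t in toks}
    return lambda tokens: int(any(t in pos_words for t in tokens))

dataset = [("the movie was great fun".split(), 1), ("it was dull".split(), 0)]
rationales = [(extract_rationale(toks, toy_saliency(toks)), y) for toks, y in dataset]
classifier = toy_train_classifier(rationales)  # trained on the rationales only
print(rationales, classifier("a fun ride".split()))
```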
Some Results – Functional Evaluation
Some Results – Human Evaluation
Important Points to Take Away
● Interpretability – no consistent definition.
● When designing a new system, ask your stakeholders what they want out of it.
● See if you can use an inherently interpretable model.
● If not, what method can you use to interpret the black box?
● Ask – does this method make sense? Question assumptions!
● Stress test and evaluate!