The Mythos of Model Interpretability Zachary C. Lipton https://arxiv.org/abs/1606.03490
Outline • What is interpretability? • What are its desiderata? • What model properties confer interpretability? • Caveats, pitfalls, and takeaways
What is Interpretability? • Many papers make axiomatic claims that some model is interpretable and therefore preferable • But what interpretability is, and precisely which desiderata it serves, are seldom defined • Does interpretability hold a consistent meaning across papers?
Inconsistent Definitions • Papers use the words interpretable, explainable, intelligible, transparent, and understandable, both interchangeably (within papers) and inconsistently (across papers) • One common thread, however, is that interpretability is something other than performance
We want good models [Figure: Evaluation Metric]
We also want interpretable models [Figure: Evaluation Metric, Interpretation]
The Human Wants Something the Metric Doesn't [Figure: Evaluation Metric, Interpretation]
What Gives? • So either the metric captures everything and people seeking interpretable models are crazy, or… • The metrics / loss functions we optimize are fundamentally mismatched with real-life objectives • We hope to refine the discourse on interpretability, introducing more specific language • Through the lens of the literature, we create a taxonomy of both objectives & methods
Outline • What is interpretability? • What are its desiderata? • What model properties confer interpretability? • Caveats, pitfalls, and takeaways
Trust • Does the model know when it's uncertain? • Does the model make the same mistakes as a human would? • Are we comfortable with the model?
Causality • We may want models to tell us something about the natural world • Supervised models are trained simply to make predictions, but are often used to take actions • Caruana et al. (2015) describe a pneumonia mortality predictor (intended for triage) that assigns lower risk to asthma patients, an artifact of the aggressive care such patients historically received rather than a causal effect
Transferability • Idealized training setups often differ from the real world • The real problem may be non-stationary, noisier, etc. • We want sanity checks that the model doesn't depend on weaknesses in the setup
Informativeness • We may train a model to make a decision • But its real purpose is to aid a person in making a decision • Thus an interpretation may be valuable simply for the extra bits of information it carries
Outline • What is interpretability? • What are its desiderata? • What model properties confer interpretability? • Caveats, pitfalls, and takeaways
Transparency • Proposed solutions for conferring interpretability tend to fall into two categories • Transparency addresses understanding how the model works • Explainability concerns the model's ability to offer some (potentially post-hoc) explanation
Simulatability • One notion of transparency is simplicity • This accords with papers advocating small decision trees • A model is transparent if a person can step through the algorithm in reasonable time
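To make simulatability concrete, here is a minimal, hypothetical sketch (not from the talk): a hand-written rule list small enough that a reader can trace any prediction by hand. The feature names and thresholds are invented for illustration.

```python
# A hypothetical rule list for a triage-style prediction. Every prediction
# can be reproduced by a person reading the rules from top to bottom.
def predict_high_risk(age: int, systolic_bp: int, has_asthma: bool) -> bool:
    if systolic_bp < 90:           # rule 1: hypotension
        return True
    if age > 75 and has_asthma:    # rule 2: elderly asthmatic
        return True
    return False                   # default: low risk

print(predict_high_risk(age=80, systolic_bp=120, has_asthma=True))  # True, via rule 2
```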
Decomposability • A relaxed notion requires understanding the individual components of a model • Such as the weights of a linear model or the nodes of a decision tree
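As a sketch of decomposability (an illustration, not part of the original slides), the code below fits a linear model with scikit-learn and prints one weight per named input feature, so each parameter admits an individual reading; the data and feature names are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: 3 named features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)
feature_names = ["age_scaled", "blood_pressure_scaled", "asthma"]

model = LogisticRegression().fit(X, y)

# Each coefficient can be inspected (and, with caution, interpreted) on its own.
for name, w in zip(feature_names, model.coef_[0]):
    print(f"{name}: {w:+.3f}")
```

Note the caveat from the talk: this kind of per-weight reading is only meaningful to the extent that the features themselves are meaningful.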
Transparent Algorithms • A yet weaker notion requires only that we understand the behavior of the learning algorithm • E.g., convergence of convex optimization, generalization bounds
Post-Hoc Interpretability • "Ah yes, something cool is happening in node 750,345,167… maybe it sees a cat?" • "Maybe we'll see something awesome if we jiggle the inputs?"
Verbal Explanations • Just as people generate explanations (absent transparency), we might train a (possibly separate) model to generate explanations • We might consider image captions as interpretations of object predictions (Image: Karpathy et al 2015)
Saliency Maps • While the full relationship between input and output might be impossible to describe succinctly, local explanations are potentially useful. (Image: Wang et al 2016)
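One common recipe for such a local explanation (a sketch under assumptions, not necessarily the method behind the cited image) is to take the gradient of the predicted class score with respect to the input pixels; large-magnitude gradients mark locally influential regions. The model below is an untrained placeholder.

```python
import torch
import torch.nn as nn

# Placeholder convolutional classifier; in practice this would be a trained model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # stand-in for an input image
score = model(image)[0].max()   # score of the top predicted class
score.backward()                # d(score) / d(pixels)

# Collapse the per-channel gradients into a single saliency map over pixels.
saliency = image.grad.abs().max(dim=1).values
print(saliency.shape)           # torch.Size([1, 32, 32])
```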
Case-Based Explanations • Another way to generate a post-hoc explanation might be to retrieve labeled items that are deemed similar by the model • For some models, we can retrieve histories from similar patients (Image: Mikolov et al 2014)
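A minimal sketch of this idea (not the authors' implementation): embed each training example with the model's learned representation, then explain a new prediction by returning the most similar training cases under cosine similarity. The representations and labels here are random placeholders standing in for real patient histories.

```python
import numpy as np

def explain_by_neighbors(query_vec, train_vecs, train_labels, k=3):
    """Return the k training examples most similar to the query in the
    model's representation space, ranked by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    T = train_vecs / np.linalg.norm(train_vecs, axis=1, keepdims=True)
    sims = T @ q
    top = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i]), int(train_labels[i])) for i in top]

# Placeholder "learned" representations for 100 past patients.
rng = np.random.default_rng(0)
train_vecs = rng.normal(size=(100, 16))
train_labels = rng.integers(0, 2, size=100)
print(explain_by_neighbors(rng.normal(size=16), train_vecs, train_labels))
```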
Outline • What is interpretability? • What are its desiderata? • What model properties confer interpretability? • Caveats, pitfalls, and takeaways
Discussion Points • Linear models are not strictly more interpretable than deep learning • Claims about interpretability must be qualified • Transparency may be at odds with the goals of AI • Post-hoc interpretations can mislead
Thanks! Acknowledgments: Zachary C. Lipton was supported by the Division of Biomedical Informatics at UCSD, via training grant (T15LM011271) from the NIH/NLM. Thanks to Charles Elkan, Julian McAuley, David Kale, Maggie Makar, Been Kim, Lihong Li, Rich Caruana, Daniel Fried, Jack Berkowitz, & Sepp Hochreiter
References:
The Mythos of Model Interpretability (ICML Workshop on Human Interpretability 2016) - ZC Lipton http://arxiv.org/abs/1606.03490
Directly Modeling Missing Data with RNNs (MLHC 2016) - ZC Lipton, DC Kale, R Wetzel http://arxiv.org/abs/1606.04130
Learning to Diagnose (ICLR 2016) - ZC Lipton, DC Kale, C Elkan, R Wetzel http://arxiv.org/abs/1511.03677
Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-day Readmission (KDD 2015) - R Caruana et al http://dl.acm.org/citation.cfm?id=2788613