Interpretable & Transparent Deep Learning
Wojciech Samek, Machine Learning Group, Fraunhofer Heinrich Hertz Institute (HHI)
Northern Lights Deep Learning Workshop (NLDL'19), Tromsø, Norway, 8th January 2019
Record Performances with ML: safety-critical applications and research projects
Need for Interpretability: verify the system, understand its weaknesses, legal aspects, learn new strategies
Naive Approach: Sensitivity Analysis
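Sensitivity analysis explains a prediction by the gradient of the class score with respect to the input: pixels the output reacts to most strongly get the highest scores. A minimal sketch in PyTorch (assuming an image classifier `model` and a batch-of-one input `x`; the function name is illustrative, not from the talk):

```python
import torch

def sensitivity_map(model, x, target_class):
    """Naive sensitivity analysis: squared gradient of the class
    score w.r.t. every input pixel, pooled over colour channels."""
    x = x.clone().requires_grad_(True)   # track gradients on the input
    score = model(x)[0, target_class]    # scalar score of the target class
    score.backward()                     # compute d(score)/d(x)
    return (x.grad ** 2).sum(dim=1)      # heatmap of shape (1, H, W)
```

Note what this measures: to which pixels the classifier is most *sensitive*, i.e., what would make the output change, not which pixels actually constitute the evidence for the decision. That gap motivates LRP below.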
Better Approach: Layer-wise Relevance Propagation (LRP) (Bach et al., PLOS ONE, 2015), which opens the black box
Better Approach: LRP
Classification: cat, rooster, dog. What makes this image a "rooster image"?
Idea: Redistribute the evidence for class "rooster" back to image space.
Better Approach: LRP
Theoretical interpretation: Deep Taylor Decomposition (Montavon et al., 2017); not based on the gradient!
Better Approach: LRP
Explanation: cat, rooster, dog
Better Approach: LRP
Heatmap of prediction "3" vs. heatmap of prediction "9"
Better Approach: LRP
More information: (Montavon et al., 2017 & 2018)
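Concretely, LRP propagates the output score backwards layer by layer, redistributing relevance in proportion to each input's contribution to the neurons it feeds. A minimal numpy sketch of the widely used LRP-epsilon rule for a single dense layer (a sketch of the general recipe, not the exact code behind the slides; names are illustrative):

```python
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """One LRP-epsilon backward step through a dense layer z = W a + b.
    Output relevance R_out is redistributed onto the inputs in
    proportion to their contributions W[j, i] * a[i]."""
    z = W @ a + b                               # pre-activations, shape (out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser against z ~ 0
    s = R_out / z                               # relevance per unit of z
    return a * (W.T @ s)                        # R_in, approximately conserving sum(R)
```

Applying such a step from the top layer down to the pixels yields the heatmaps shown above; relevance is (approximately) conserved from layer to layer, so the heatmap decomposes the prediction f(x).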
Decomposing the Correct Quantity
Why Simple Taylor Decomposition Doesn't Work
Deep Taylor Decomposition
Idea: Since a neural network is composed of simple functions, we propose a deep Taylor decomposition. In each explanation step it is then easy to find a good root point, and there is no gradient shattering. (Montavon et al., 2017 & 2018)
Deep Taylor Decomposition
Output relevance: f(x) = R.
Top layer: R_j = a_j * 1; in general R_j = a_j * const.
Per layer: Taylor-decompose R_j, which yields R_i ≈ a_i * const, and redistribute the relevance to the lower layer i.
Deep Taylor Decomposition
How to choose the root point?
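For ReLU layers with non-negative inputs, one natural root-point choice (searching along the direction of positive weight contributions) leads to a closed-form redistribution, the z+ rule, in which relevance flows through positive weight contributions only. A numpy sketch under these assumptions (names illustrative):

```python
import numpy as np

def deep_taylor_zplus(a, W, R_out):
    """Deep Taylor z+ rule for a ReLU layer with inputs a >= 0:
    the root-point choice reduces to redistributing relevance
    through the positive weights only."""
    Wp = np.maximum(W, 0.0)      # keep positive weights, drop the rest
    z = Wp @ a + 1e-9            # strictly positive denominators
    s = R_out / z
    return a * (Wp.T @ s)        # input relevance, sums preserved
```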
Other Explanation Methods
Axiomatic Approach to Interpretability
LRP Revisited
Applications across data types: digits (Bach'15), general images (Bach'15, Lapuschkin'16), text analysis (Arras'16 & '17), speech (Becker'18), face morphing (Seibold'18), games (Lapuschkin'19), VQA (Arras'18), video (Anders'18), gait patterns (Horst'19), EEG (Sturm'16), faces (Lapuschkin'17), fMRI (Thomas'18), histopathology (Binder'18)
LRP Applied to Different Models
Convolutional NNs (Bach'15, Arras'17, …), local renormalization layers (Binder'16), LSTM (Arras'17, Thomas'18), bag-of-words / Fisher vector models (Bach'15, Arras'16, Lapuschkin'17, Binder'18), one-class SVM (Kauffmann'18)
Application: Compare Classifiers
word2vec/CNN: accuracy 80.19%. Strategy to solve the problem: identify semantically meaningful words related to the topic.
BoW/SVM: accuracy 80.10%. Strategy to solve the problem: identify statistical patterns, i.e., exploit word statistics.
(Arras et al. 2016 & 2017)
Application: Compare Classifiers
Same performance → same strategy? (Lapuschkin et al. 2016)
Application: Compare Classifiers
'horse' images in PASCAL VOC 2007
Application: Measure Context Use
How important is context for the classifier?
importance of context = relevance outside bbox / relevance inside bbox
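A sketch of this ratio computed on a 2-D LRP heatmap (assuming, on my part, that only positive relevance is pooled; the bbox convention is illustrative):

```python
import numpy as np

def context_importance(relevance, bbox):
    """importance of context = relevance outside bbox / relevance inside.
    `relevance` is a 2-D heatmap, bbox = (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = bbox
    pos = np.maximum(relevance, 0.0)     # pool positive evidence only (assumption)
    inside = pos[y0:y1, x0:x1].sum()
    outside = pos.sum() - inside
    return outside / inside
```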
Application: Measure Context Use
Context use is anti-correlated with performance (models compared: BVLC CaffeNet, GoogLeNet, VGG CNN S). (Lapuschkin et al. 2016)
Application: Face Analysis
Gender classification, with vs. without pretraining. Strategy to solve the problem: focus on chin/beard, eyes & hair; but without pretraining the model overfits. (Lapuschkin et al., 2017)
Application: Face Analysis
Age classification (pretraining on ImageNet vs. on IMDB-WIKI). Predictions: 25-32 years old vs. 60+ years old. Strategy to solve the problem: focus on laughing; laughing speaks against 60+, i.e., the model learned that old people do not laugh. (Lapuschkin et al., 2017)
Application: EEG Analysis
Brain-computer interfacing: how the brain solves the task is subject-dependent → individual explanations. Explain the CNN/DNN with LRP. (Sturm et al. 2016)
Application: EEG Analysis
With LRP we can analyze why a trial was misclassified. (Sturm et al. 2016)
Application: Sentiment Analysis
How to handle multiplicative interactions? Gate neurons affect the relevance distribution only indirectly, through the forward pass. Negative vs. positive sentiment. (Arras et al., 2017)
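In the LSTM rule of Arras et al. (2017), a gated product p = gate * signal passes all relevance to the signal neuron and none to the gate; the gate shapes the explanation only through the values it contributed in the forward pass. A minimal sketch:

```python
def lrp_multiplicative(gate, signal, R_product):
    """Signal-takes-all relevance rule for a product p = gate * signal
    (Arras et al., 2017): the gate merely modulates, so it receives
    zero relevance; everything goes to the signal."""
    R_signal = R_product   # all relevance to the signal neuron
    R_gate = 0.0           # gates carry no evidence of their own
    return R_gate, R_signal
```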
Application: fMRI Analysis (Thomas et al. 2018)
Application: Gait Analysis
I: Record gait data (lower-body joint angles, ground reaction force). II: Predict the subject with a DNN from the measured gait features, e.g. f(x) = "Subject 6". III: Explain using LRP (gait feature relevance over time, visualised with a colour spectrum).
Our approach: classify & explain individual gait patterns; important for understanding diseases such as Parkinson's. (Horst et al. 2018)
Application: Understand the Model
The model understands the question and correctly identifies the object of interest. (Arras et al., 2018)
Application: Understand the Model (Anders et al., 2018)
Application: Understand the Model
Observation: Explanations focus on the borders of the video, as if the model wants to watch more of it.
Application: Understand the Model
Idea: Play the video in fast forward (without retraining); the classification accuracy then improves.
Application: Understand the Model
Female vs. male speaker: the model classifies gender based on the fundamental frequency and its immediate harmonics (see also Traunmüller & Eriksson 1995). (Becker et al., 2018)
Application: Understand the Model (Lapuschkin et al., in prep.)
Application: Understand the Model
The model learns to: 1. track the ball, 2. focus on the paddle, 3. focus on the tunnel. (Lapuschkin et al., in prep.)