Interpretability in NLP: Moving Beyond Vision
Shuoyang Ding
Microsoft Translator Talk Series, Oct 10th, 2019
Work done in collaboration with Philipp Koehn and Hainan Xu
Outline
• A Quick Tour of Interpretability
  • Model Transparency
  • Post-hoc Interpretations
• Moving Visual Interpretability to Language:
  • Word Alignment for NMT Via Model Interpretation
  • Benchmarking Interpretations Via Lexical Agreement
• Future Work
What is Interpretability?
• No consensus!
• Categorization proposed in [Lipton 2018]:
  • Model Transparency
  • Post-hoc Interpretation
Toy Example
[Diagram: a speaker that could be playing audio from any of a game console, TV box, CD player, or laptop]

Toy Example
[Diagram: the speaker is playing something, but which of the devices is the source?]

A Transparent Model
[Diagram: an amplifier with numbered inputs 1-4, one per device, so the selected channel reveals which source the speaker is playing]
Transparent Models
• Build another model that accomplishes the same task, but with easily explainable behaviors
• Deep neural networks are not interpretable…
• So what models are? (Open question)
  • log-linear models?
  • attention models?
Meh. Too lazy for that!
[Diagram: the speaker again, with the source device unknown]
Post-hoc Interpretation
• Ask a human
  • Interpretation with a stand-alone model (a different task!)
• Jiggle the cable!
  • Interpretation with sensitivity w.r.t. features
A Little Abstraction…
Relative Sensitivity…?
Saliency: $\frac{\partial y}{\partial x_i}$, the gradient of the output $y$ w.r.t. each input feature $x_i$
What’s good about this?
1. Model-agnostic, and yet with some exposure to the interpreted model
2. Derivatives are easy to obtain in any DL toolkit (see the sketch below)
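Point 2 in practice: a minimal sketch of vanilla gradient saliency in PyTorch. The model, the input tensor x, and scoring the output at target_idx are illustrative assumptions, not the exact implementation from the talk.

    import torch

    def vanilla_saliency(model, x, target_idx):
        # Gradient of one output score w.r.t. the input features.
        x = x.clone().detach().requires_grad_(True)  # track gradients on the input
        y = model(x)                                 # forward pass
        y[target_idx].backward()                     # d(output) / d(input)
        return x.grad.detach()                       # saliency has the shape of x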
Saliency in Computer Vision
[Figure: image saliency maps, from https://pair-code.github.io/saliency/]
SmoothGrad
• Gradients are a very local measure of sensitivity.
• Highly non-linear models may have pathological points where the gradients are noisy. [Smilkov et al. 2017]
SmoothGrad
• Solution: calculate saliency for multiple copies of the same input corrupted with Gaussian noise, and average the saliency over the copies (sketched below).
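A sketch of that recipe, reusing the hypothetical vanilla_saliency from the earlier snippet; n_samples and sigma correspond to the N and σ reported in the setup slide at the end.

    def smoothgrad_saliency(model, x, target_idx, n_samples=30, sigma=0.15):
        # Average vanilla saliency over noisy copies of the same input.
        grads = [
            vanilla_saliency(model, x + sigma * torch.randn_like(x), target_idx)
            for _ in range(n_samples)
        ]
        return torch.stack(grads).mean(dim=0)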
SmoothGrad in Computer Vision
[Figure: original image, vanilla gradients, SmoothGrad, and Integrated Gradients side by side, from https://pair-code.github.io/saliency/]
Integrated Gradients (IG)
• Proposed to solve feature saturation
• Baseline: an input that carries no information
• Compute gradients on inputs interpolated between the baseline & the input, and average them by integration [Sundararajan et al. 2017] (sketched below)
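One way to realize the integral, under the same hypothetical interface as above, is a Riemann-sum approximation over points on the straight path from the baseline to the input; the all-zeros baseline and 50 steps are illustrative choices.

    def integrated_gradients(model, x, target_idx, steps=50):
        baseline = torch.zeros_like(x)   # assumed "no information" input
        total = torch.zeros_like(x)
        for k in range(1, steps + 1):
            point = baseline + (k / steps) * (x - baseline)  # interpolated input
            total += vanilla_saliency(model, point, target_idx)
        return (x - baseline) * total / steps  # average gradient, scaled by displacement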
IG in Computer Vision
[Figure: original image, vanilla gradients, SmoothGrad, and Integrated Gradients side by side, from https://pair-code.github.io/saliency/]
Summary
• Model Transparency: build a model that operates in an explainable way; interpretation does not depend on a specific output
• Post-hoc Interpretation: keep the original model intact; interpretation depends on a specific output
Summary
• How is this related to what I’m talking about next?
  • Word Alignment for NMT Via Model Interpretation: transparent models vs. post-hoc interpretations
  • Benchmarking Interpretations Via Lexical Agreement: different post-hoc interpretation methods
Outline
• A Quick Tour of Interpretability
  • Model Transparency
  • Post-hoc Interpretations
• Moving Visual Interpretability to Language:
  • Word Alignment for NMT Via Model Interpretation
  • Benchmarking Interpretations Via Lexical Agreement
• Future Work
Word Alignment
Model Transparency?
Wait… word alignments should be aware of the output!
Post-hoc Interpretations with Stand-alone Models?
$p(a_{ij} \mid e, f)$
Hint: GIZA++, fast-align, etc.
Post-hoc Interpretations with Perturbation/Sensitivity?
“Feature” in Computer Vision
[Photo, credit: Hainan Xu]
“Feature” in NLP
It’s straightforward to compute saliency for a single dimension of the word embedding.
“Feature” in NLP
But how to compose the saliency of each dimension into the saliency of a word?
Li et al. 2016: Visualizing and Understanding Neural Models in NLP
$\frac{1}{N} \sum_{i=1}^{N} \left| \frac{\partial y}{\partial e_i} \right|$    range: $(0, \infty)$
Our Proposal
Consider word embedding look-up as a dot product between the embedding matrix and a one-hot vector.
Our Proposal
The 1 in the one-hot vector denotes the identity of the input word.
Our Proposal
Let’s perturb that 1 like a real value! i.e., take gradients with respect to the 1.
Our Proposal e i ⋅ ∂ y ∑ ∂ e i i range: ( −∞ , ∞ ) N 1 ∂ y ∑ Recall this is different from Li’s proposal: N ∂ e i i =1 Shuoyang Ding — Interpretability in NLP: Moving Beyond Vision 47
Why is this proposal better?
• An input word may strongly discourage certain translations and still carry a large (negative) gradient.
• Those are salient words, but they shouldn’t be aligned.
• Absolute value / L2 norm falls into this pit.
Evaluation
• Evaluation of interpretations is tricky!
• Fortunately, there are human judgments to rely on.
• Need to do forced decoding with the NMT model (a sketch follows).
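A hedged sketch of how forced decoding and word saliency could combine into alignments. nmt_saliency is a hypothetical helper that runs the model with the reference target as decoder input and returns one word-level saliency score per source token for target position j; real toolkits (e.g. fairseq) differ in the details.

    def align_by_saliency(nmt_model, src_tokens, tgt_tokens, nmt_saliency):
        # For each forced target word, align it to the most salient source word.
        alignment = []
        for j in range(len(tgt_tokens)):
            scores = nmt_saliency(nmt_model, src_tokens, tgt_tokens, j)
            alignment.append((int(scores.argmax()), j))  # (source idx, target idx)
        return alignment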
Setup
• Architectures: Convolutional S2S, LSTM, Transformer (with fairseq default hyperparameters)
• Datasets: following Zenkel et al. [2019], which covers de-en, fr-en and ro-en
• SmoothGrad hyperparameters: N=30 and σ=0.15