Interpretability in NLP: Moving Beyond Vision


  1. Interpretability in NLP: Moving Beyond Vision. Shuoyang Ding, Microsoft Translator Talk Series, Oct 10th, 2019. Work done in collaboration with Philipp Koehn and Hainan Xu.

  2. Outline • A Quick Tour of Interpretability • Model Transparency • Post-hoc Interpretations • Moving Visual Interpretability to Language: • Word Alignment for NMT Via Model Interpretation • Benchmarking Interpretations Via Lexical Agreement • Future Work

  3. Outline (repeat of slide 2)

  4. What is Interpretability? • No consensus! • Categorization proposed in [Lipton 2018]: • Model Transparency • Post-hoc Interpretation

  5. Toy Example [figure: a speaker and four devices: game console, TV box, CD player, laptop]

  6. Toy Example [figure: the same setup, with unknown ("?") connections between the devices and the speaker]

  7. A Transparent Model [figure: an amplifier with numbered inputs 1–4 connecting the game console, TV box, CD player, and laptop to the speaker]

  8. Transparent Models • Build another model that accomplishes the same task, but with easily explainable behaviors • Deep neural networks are not interpretable… • So what models are? (Open question) • log-linear model? • attention model?

  9. Meh. Too lazy for that! [figure: the toy example again, connections still unknown]

  10. Post-hoc Interpretation • Ask a human: interpretation with a stand-alone model (different task!) • Jiggle the cable: interpretation with sensitivity w.r.t. features

  11. Post-hoc Interpretation (repeat of slide 10)

  12.–15. A Little Abstraction… [four-slide figure build; images not transcribed]

  16.–17. Relative Sensitivity…? [two-slide build; formulas not transcribed]

  18. Saliency [formula not transcribed; saliency here is the gradient of the output w.r.t. the input features]

  19. What’s good about this? 1. Model-agnostic, and yet with some exposure to the interpreted model 2. Derivatives are easy to obtain in any DL toolkit
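
To make point 2 concrete, here is a minimal sketch of vanilla gradient saliency in PyTorch (not from the talk; model, x, and target_idx are hypothetical stand-ins for a scoring model, an input tensor, and the output index of interest):

    import torch

    def vanilla_saliency(model, x, target_idx):
        # One forward and one backward pass give dy/dx: the local
        # sensitivity of the chosen output score to each input feature.
        x = x.clone().detach().requires_grad_(True)
        y = model(x)[target_idx]  # scalar score to interpret
        y.backward()
        return x.grad             # same shape as x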

  20. Saliency in Computer Vision [image saliency examples from https://pair-code.github.io/saliency/]

  21. SmoothGrad • Gradients are a very local measure of sensitivity. • Highly non-linear models may have pathological points where the gradients are noisy. [Smilkov et al. 2017]

  22.–23. SmoothGrad [figures; images not transcribed]

  24. SmoothGrad • Solution: calculate saliency for multiple copies of the same input corrupted with Gaussian noise, and average the saliency over the copies.
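
A minimal sketch of that averaging step, under the same assumptions as the snippet above and reusing the hyper-parameters reported on slide 49 (N=30, σ=0.15); note that implementations often scale σ by the input range rather than applying it directly as done here:

    import torch

    def smoothgrad(model, x, target_idx, n=30, sigma=0.15):
        grads = torch.zeros_like(x)
        for _ in range(n):
            # Corrupt a copy of the input with Gaussian noise...
            noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
            y = model(noisy)[target_idx]
            y.backward()
            grads += noisy.grad
        # ...and average the per-copy saliency maps.
        return grads / n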

  25. SmoothGrad [figure; image not transcribed]

  26. SmoothGrad in Computer Vision [panels: Original Image, Vanilla, SmoothGrad, Integrated Gradients; from https://pair-code.github.io/saliency/]

  27. Integrated Gradients (IG) • Proposed to solve feature saturation • Baseline: an input that carries no information • Compute gradients on inputs interpolated between the baseline and the actual input, and average them by integration [Sundararajan et al. 2017]
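
IG attributes $\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F(x' + \alpha(x - x'))}{\partial x_i} \, d\alpha$, where $x'$ is the baseline. A sketch that approximates the integral with a Riemann sum, assuming the same PyTorch setup as above and an all-zeros baseline (one common but task-dependent choice):

    import torch

    def integrated_gradients(model, x, target_idx, steps=50):
        baseline = torch.zeros_like(x)  # assumed "no information" baseline
        total = torch.zeros_like(x)
        for alpha in torch.linspace(0.0, 1.0, steps):
            # Gradient at a point interpolated between baseline and input.
            point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
            y = model(point)[target_idx]
            y.backward()
            total += point.grad
        # Scale the averaged gradients by the input-baseline difference.
        return (x - baseline) * total / steps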

  28. IG in Computer Vision [panels: Original Image, Vanilla, SmoothGrad, Integrated Gradients; from https://pair-code.github.io/saliency/]

  29. Summary [figure: the toy speaker example shown under both approaches] Model Transparency: • Build a model that operates in an explainable way • Interpretation does not depend on a specific output Post-hoc Interpretation: • Keep the original model intact • Interpretation depends on a specific output

  30. Summary • How is this related to what I’m talking about next? • Word Alignment for NMT Via Model Interpretation: transparent models vs. post-hoc interpretations • Benchmarking Interpretations Via Lexical Agreement: different post-hoc interpretation methods

  31. Outline (repeat of slide 2)

  32.–33. Word Alignment [figures: an example source-target sentence pair with alignment links; images not transcribed]

  34. Model Transparency? [figure; image not transcribed]

  35. Model Transparency? Wait… word alignments should be aware of the output!

  36. Post-hoc Interpretations with Stand-alone Models? $p(a_{ij} \mid e, f)$ Hint: GIZA++, fast-align, etc.

  37.–38. Post-hoc Interpretations with Perturbation/Sensitivity? [figures; images not transcribed]

  39. “Feature” in Computer Vision [photo; credit: Hainan Xu]

  40. “Feature” in NLP It’s straightforward to compute saliency for a single dimension of the word embedding.

  41. “Feature” in NLP But how do we compose the saliency of each dimension into the saliency of a word?

  42. Li et al. 2016, Visualizing and Understanding Neural Models in NLP: $\frac{1}{N}\sum_{i=1}^{N}\left|\frac{\partial y}{\partial e_i}\right|$, range: $(0, \infty)$, where $e_i$ is the $i$-th dimension of the word embedding and $N$ is the embedding size.

  43. Our Proposal Consider word embedding look-up as a dot product between the embedding matrix and a one-hot vector.

  44. Our Proposal The 1 in the one-hot vector denotes the identity of the input word.

  45. Our Proposal Let’s perturb that 1 like a real value! i.e. take gradients with regard to the 1.

  46. Our Proposal $\sum_i e_i \cdot \frac{\partial y}{\partial e_i}$, range: $(-\infty, \infty)$. Recall this is different from Li’s proposal: $\frac{1}{N}\sum_{i=1}^{N}\left|\frac{\partial y}{\partial e_i}\right|$.
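
In code the contrast is small but consequential. A sketch where e is a word's embedding vector and e_grad is the gradient of the translation score with respect to that embedding (both names are illustrative):

    import torch

    def li_saliency(e_grad):
        # Li et al. 2016: mean absolute gradient over embedding dimensions.
        # Range (0, inf): the sign of the evidence is discarded.
        return e_grad.abs().mean()

    def onehot_saliency(e, e_grad):
        # This talk's proposal: e . dy/de, which by the chain rule equals
        # the gradient w.r.t. the "1" in the one-hot lookup vector.
        # Range (-inf, inf): discouraging evidence stays negative.
        return torch.dot(e, e_grad)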

  47. Why is this proposal better? • An input word may strongly discourage certain translations and still carry a large (negative) gradient. • Those are salient words, but they shouldn’t be aligned. • Absolute value/L2-norm falls into this pit.

  48. Evaluation • Evaluation of interpretations is tricky! • Fortunately, there are human judgments to rely on. • Need to do forced decoding with the NMT model.

  49. Setup • Architecture: Convolutional S2S, LSTM, Transformer (with fairseq default hyper-parameters) • Dataset: following Zenkel et al. [2019], which covers de-en, fr-en, and ro-en • SmoothGrad hyper-parameters: N=30 and σ=0.15
