Semantically Equivalent Adversarial Rules for Debugging NLP Models
Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh (UC Irvine)


  1. Semantically Equivalent Adversarial Rules for Debugging NLP Models. Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh (UC Irvine)

  2. NLP / ML models are getting smarter: VQA. "What type of road sign is shown?" > STOP. Visual7W [Zhu et al 2016]

  3. NLP / ML models are getting smarter: MC (SQuAD). Passage: "The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi)." "How long is the Rhine?" > 1230km. BiDAF [Seo et al 2017]. Question: are they prone to oversensitivity?

  4. Oversensitivity in images: "panda" (57.7% confidence) vs "gibbon" (99.3% confidence). Adversaries are indistinguishable to humans… but unlikely in the real world (except for attacks)

  5. Adversarial examples: find the closest example with a different prediction

  6. What about text? "What type of road sign is shown?" > STOP. A character-level perturbation of the question ("What type of road sign is sho wn?") is perceptible by humans, and unlikely in the real world

  7. What about text? "What type of road sign is shown?" > STOP. Swapping out a word of the question instead? A single word changes too much!

  8. Semantics matter: "What type of road sign is shown?" > STOP. A slightly rephrased version of the question > Do not Enter. A bug, and likely in the real world

  9. Semantics matter. Passage: "The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi)." "How long is the Rhine?" > 1230km; a trivially modified version of the question > More than 1,050,000. Not all changes are the same: semantics matter

  10. Adversarial Rules: find a rule that generates many adversaries

  11. Generalizing adversaries: "What type of road sign is shown?" > STOP; "Which type of road sign is shown?" > Do not Enter. Rule: (What NOUN → Which NOUN), flips 3.9% of examples

  12. Semantics matter: "What color is the sky?" > Blue; "Which color is the sky?" > Gray. Rule: (What NOUN → Which NOUN), flips 3.9% of examples

  13. Semantics matter. Passage: "The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi)." "How long is the Rhine?" > 1230km; "How long is the Rhine??" > More than 1,050,000. Rule: (? → ??), flips 3% of examples

  14. Semantics matter. Passage: "Detailed investigation of chum salmon, Oncorhynchus keta, showed that these fish digest ctenophores 20 times as fast as an equal weight of shrimps." "What is the oncorhynchus also called?" > chum salmon; "What is the oncorhynchus also called??" > Oncorhynchus keta. Rule: (? → ??), flips 3% of examples

  15. Adversarial Rules: rules are global and actionable, more interesting than individual adversaries
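
The flip percentages quoted on these slides can be estimated by applying a rule wherever it matches and counting how often the prediction changes. A minimal sketch in Python, assuming hypothetical apply_rule and predict helpers (the slides do not pin down the exact denominator; this version divides by the examples the rule actually rewrites):

```python
# Minimal sketch: measure how often a rewrite rule flips a model's prediction.
# `apply_rule(rule, x)` and `predict(x)` are assumed helpers, not the paper's code.

def flip_rate(rule, examples, apply_rule, predict):
    matched = flips = 0
    for x in examples:
        x_new = apply_rule(rule, x)
        if x_new == x:
            continue                      # rule does not apply to this example
        matched += 1
        if predict(x_new) != predict(x):  # prediction changed: an adversary
            flips += 1
    return flips / matched if matched else 0.0
```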

  16. Semantically Equivalent Adversary (SEA)

  17. Ingredients: a black box model and a semantic score function; we look for inputs that are semantically equivalent to the original but get a different prediction

  18. Revisiting adversaries: find the closest example with a different prediction
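
A compact way to state what slides 17 and 18 combine (notation is ours: f is the black-box model, SemEq the semantic score from the next slide, and tau an equivalence threshold):

```latex
% x' is a semantically equivalent adversary (SEA) for x when the scorer judges
% it equivalent to x while the black-box model changes its prediction.
\mathrm{SEA}(x, x') \;=\; \mathbb{1}\!\left[\, \mathrm{SemEq}(x, x') \ge \tau \;\wedge\; f(x) \ne f(x') \,\right]
```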

  19. Semantic Similarity: Paraphrasing [Mallinson et al, 2017]. Translate the sentence (e.g. "Good movie!") into pivot languages (Portuguese "Bom filme!", French "Bon film!") and back-translate; the back-translations become scored paraphrase candidates (e.g. "Good movie" 0.35, "Good film" 0.34, "Great movie" 0.1, …, "Movie good" 0.001), so a language model comes for free
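
A minimal sketch of this pivot-translation idea, assuming a hypothetical translate interface that returns n-best translations with probabilities (the actual system uses neural MT paraphrasing as in Mallinson et al, 2017):

```python
# Sketch of paraphrase scoring by pivot (back-)translation, as in the diagram:
# translate the sentence into pivot languages and back, and score each English
# candidate by how much translation probability mass lands on it.
# `translate` is a hypothetical interface, not a call into any real library.
from collections import defaultdict

def translate(text, src, tgt, n_best=5):
    """Placeholder: return a list of (translation, probability) pairs."""
    raise NotImplementedError("wire this up to any machine-translation system")

def paraphrase_scores(sentence, pivots=("pt", "fr")):
    scores = defaultdict(float)
    for pivot in pivots:
        for pivot_sent, p_fwd in translate(sentence, "en", pivot):
            for candidate, p_back in translate(pivot_sent, pivot, "en"):
                scores[candidate] += p_fwd * p_back / len(pivots)
    return dict(scores)  # e.g. {"Good movie": 0.35, "Good film": 0.34, ...}
```

Because the candidates and their scores fall out of the translation models themselves, this doubles as the "language model comes for free" the slide mentions.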

  20. Finding an adversary: "What color is the tray?" > Pink; "What colour is the tray?" > Green; "Which color is the tray?" > Green; "What color is it?" > Green; "What color is tray?" > Pink; "How color is the tray?" > Green
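
A sketch of the selection step shown here, reusing the paraphrase scorer above plus a hypothetical predict function for the black-box model (the 0.8 threshold is only illustrative):

```python
# Keep paraphrase candidates that are scored as equivalent to the original
# question but receive a different answer from the model: these are SEAs.

def find_seas(question, predict, paraphrase_scores, threshold=0.8):
    original_answer = predict(question)        # e.g. "Pink"
    candidates = paraphrase_scores(question)   # {"What colour is the tray?": 0.9, ...}
    seas = [(score, cand) for cand, score in candidates.items()
            if score >= threshold and predict(cand) != original_answer]
    return [cand for _, cand in sorted(seas, reverse=True)]  # best score first
```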

  21. Semantically Equivalent Adversarial Rules (SEARs)

  22. From SEAs to Rules: Find SEAs → Propose Candidate Rules → Select Small Rule Set

  23. Proposing Candidate Rules: from the SEA pair "What type of road sign is shown?" → "Which type of road sign is shown?", propose rules of increasing generality: exact match (What → Which); context (What type → Which type); POS tags (What NOUN → Which NOUN), (WP type → Which type), (WP NOUN → Which NOUN), … Rules must not change semantics. Applied elsewhere, such rules rewrite "What is the person looking at?" → "Which is the person looking at?" and "What was I thinking?" → "Which was I thinking?"
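
A sketch of how such candidates can be generated from one (original, adversary) pair, assuming a hypothetical pos_tag helper; it only covers the simple case where the two questions align token by token:

```python
# Propose rewrite rules at several levels of generality from one aligned pair:
# exact words, the word plus its right-hand context, and POS generalizations.
# `pos_tag` is a placeholder for any part-of-speech tagger.

def pos_tag(tokens):
    """Placeholder: one tag per token, e.g. ['WP', 'NOUN', 'VBZ', ...]."""
    raise NotImplementedError

def propose_rules(original, adversary):
    orig_toks, adv_toks = original.split(), adversary.split()
    tags = pos_tag(orig_toks)
    rules = set()
    for i, (o, a) in enumerate(zip(orig_toks, adv_toks)):
        if o == a:
            continue
        rules.add((o, a))                                      # (What -> Which)
        if i + 1 < len(orig_toks):
            nxt, nxt_tag = orig_toks[i + 1], tags[i + 1]
            rules.add((f"{o} {nxt}", f"{a} {nxt}"))            # (What type -> Which type)
            rules.add((f"{o} {nxt_tag}", f"{a} {nxt_tag}"))    # (What NOUN -> Which NOUN)
            rules.add((f"{tags[i]} {nxt}", f"{a} {nxt}"))      # (WP type -> Which type)
            rules.add((f"{tags[i]} {nxt_tag}", f"{a} {nxt_tag}"))  # (WP NOUN -> Which NOUN)
    return rules
```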

  24. From SEAs to Rules: Find SEAs → Propose Candidate Rules → Select Small Rule Set

  25. Semantically Equivalent Adversarial Rules (SEARs): select rules with a high adversary count (each rule induces many flipped predictions) and non-redundancy (the rules flip different predictions). Example selected rules: (What type → Which type), (color → colour), (What NOUN → Which NOUN)
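
Both criteria can be captured by a greedy marginal-coverage selection; a sketch follows, where flipped maps each candidate rule to the set of example ids it flips (computed beforehand) and the budget is arbitrary. The paper's actual selection procedure may differ:

```python
# Greedily pick rules that flip many examples not already flipped by the rules
# chosen so far: high adversary count plus non-redundancy.

def select_rules(flipped, budget=10):
    selected, covered = [], set()
    candidates = dict(flipped)
    for _ in range(budget):
        best = max(candidates, key=lambda r: len(candidates[r] - covered), default=None)
        if best is None or not (candidates[best] - covered):
            break  # no remaining rule flips anything new
        selected.append(best)
        covered |= candidates.pop(best)
    return selected
```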

  26. Examples: VQA. Visual7W-Telling [Zhu et al 2016]

  27. Examples: Machine Comprehension. BiDAF [Seo et al 2017]

  28. Examples: Movie Review Sentiment Analysis. FastText [Joulin et al 2016]

  29. Experiments

  30. Experiment 1: SEAs vs Humans

  31. Set up: compare Humans, the top-scored SEA, and SEA (top 5) + Human; evaluate the resulting adversaries for semantic equivalence

  32. How often can SEAs be produced? Bar charts for Sentiment Analysis and Visual Question Answering compare Human, SEA, and Human + SEA. Takeaways: SEAs find equivalent adversaries as often as Humans, and SEAs + Humans do better than Humans alone

  33. Humans produce different adversaries. Humans did not produce these: "What kind of meat is on the boy’s plate?", "They are so easy to love…". But they did produce these: "How many suitcases?", "Also great directing and photography"

  34. Experiment 2: SEARs vs Experts

  35. Part 1: experts come up with rules. Objective: maximize mistakes with good rules

  36. Part 2: experts evaluate our SEARs. Experts only accept good rules

  37. Results: bar charts compare the time (minutes) experts spend finding rules vs evaluating SEARs, and the % of correct predictions flipped by Experts vs SEARs, on Visual QA and Sentiment; SEARs flip more predictions while requiring less expert time

  38. Experiment 3: Fixing bugs

  39. Closing the loop: filter out bad rules (keeping rules such as (color → colour) and (WP VBZ → WP’s)), augment the training data with them, and retrain the model
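
A sketch of the augmentation step in this loop, assuming hypothetical apply_rule and train helpers; labels are copied unchanged because the accepted rules are semantics-preserving:

```python
# Apply each accepted rule to every training input; any rewrite it produces is
# added as an extra training example with the original label, then retrain.

def augment_and_retrain(train_set, accepted_rules, apply_rule, train):
    augmented = list(train_set)
    for x, y in train_set:
        for rule in accepted_rules:
            x_new = apply_rule(rule, x)
            if x_new != x:
                augmented.append((x_new, y))  # same label, rephrased input
    return train(augmented)
```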

  40. Results: fix bugs with no loss in accuracy. Bar charts of the % of flips due to bugs for the Original vs Augmented models on Visual QA and Sentiment; augmentation sharply reduces the flips

  41. Conclusion: semantics matter; models are prone to these bugs; SEAs and SEARs help find and fix them

  42. Semantically Equivalent Adversarial Rules for Debugging NLP Models. Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh (UC Irvine)

  43. Semantic scoring is still a research problem… Also: inaccurate for long texts

  44. Problem: scores are not comparable across instances. Example: the paraphrase-score distributions for "good movie" (Good movie, Good film, Great movie, …) and for "good" (good, great, excellent, …) live on very different scales

  45. Examples: VQA

  46. Examples: Movie Review Sentiment Analysis. FastText [Joulin et al 2016]
