Semantically Equivalent Adversarial Rules for Debugging NLP Models
Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh (UC Irvine)
NLP / ML models are getting smarter: VQA
What type of road sign is shown? > STOP.
Visual7W [Zhu et al 2016]
NLP / ML models are getting smarter: MC (SQuAD)
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
BiDAF [Seo et al 2017]
Question: are these models prone to oversensitivity?
Oversensitivity in images
“panda” (57.7% confidence) → “gibbon” (99.3% confidence)
Adversaries are indistinguishable to humans… but unlikely in the real world (except for attacks)
Adversarial examples
Find the closest example with a different prediction
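In symbols, the standard formulation (not spelled out on the slide) is: given a model f and an input x, an adversarial example is the nearest input under some distance d that changes the prediction:

\[ x' \in \arg\min_{z}\, d(x, z) \quad \text{s.t.} \quad f(z) \neq f(x) \]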
What about text?
What type of road sign is shown? > STOP.
What type of road sign is sho wn? (character-level perturbation)
Perceptible by humans, unlikely in the real world
What about text?
What type of road sign is shown? > STOP.
Replacing a single word changes too much!
Semantics matter
What type of road sign is shown? > STOP.
Which type of road sign is shown? > Do not Enter.
A bug, and likely in the real world
Semantics matter
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
How long is the Rhine?? > More than 1,050,000
Not all changes are the same: semantics matter
Adversarial Rules
Find a rule that generates many adversaries
Generalizing adversaries
What type of road sign is shown? > STOP.
Which type of road sign is shown? > Do not Enter.
Rule: What NOUN → Which NOUN (flips 3.9% of examples)
Semantics matter
What color is the sky? > Blue.
Which color is the sky? > Gray.
Rule: What NOUN → Which NOUN (flips 3.9% of examples)
Semantics matter
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
How long is the Rhine?? > More than 1,050,000
Rule: ? → ?? (flips 3% of examples)
Semantics matter
Passage: Detailed investigation of chum salmon, Oncorhynchus keta, showed that these fish digest ctenophores 20 times as fast as an equal weight of shrimps.
What is the oncorhynchus also called? > chum salmon
What is the oncorhynchus also called?? > Oncorhynchus keta
Rule: ? → ?? (flips 3% of examples)
Adversarial Rules
Rules are global and actionable, more interesting than individual adversaries
Semantically Equivalent Adversary (SEA)
Ingredients
- A semantic score function
- A black-box model
Goal: a different prediction on a semantically equivalent input
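Putting the ingredients together, a formulation along the lines of the SEARs paper (my paraphrase; the threshold τ is a semantic-equivalence cutoff):

\[ \mathrm{SEA}(x, x') = \mathbb{1}\big[\, \mathrm{SemEq}(x, x') \ge \tau \;\wedge\; f(x) \ne f(x') \,\big] \]

i.e., x' is a semantically equivalent adversary for x if it scores as equivalent to x yet receives a different prediction from the black-box model f.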
Revisiting adversaries
Find the closest example with a different prediction
Semantic Similarity: Paraphrasing [Mallinson et al, 2017]
Translate the sentence into pivot languages and back: Good movie! > (en-pt) Bom filme!, (en-fr) Bon film! > back-translate to English.
The back-translators score candidate paraphrases: Good movie (0.35), Good film (0.34), Great movie (0.1), …, Movie good (0.001).
A language model over paraphrases comes for free.
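As a rough illustration of this back-translation pipeline, here is a minimal sketch using the open Helsinki-NLP MarianMT models from HuggingFace transformers, with English-French as a single pivot. These are not the translation models used in the talk, and the returned scores are only illustrative.

```python
from transformers import MarianMTModel, MarianTokenizer

# Assumed pivot: English <-> French MarianMT checkpoints (the talk uses several pivot languages).
en_fr_tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
en_fr = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
fr_en_tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en")
fr_en = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-fr-en")

def paraphrases_with_scores(sentence, num_candidates=8):
    """Translate to the pivot language, back-translate with beam search,
    and return candidate paraphrases with their beam probabilities."""
    pivot_ids = en_fr.generate(**en_fr_tok(sentence, return_tensors="pt"))
    pivot = en_fr_tok.decode(pivot_ids[0], skip_special_tokens=True)
    out = fr_en.generate(
        **fr_en_tok(pivot, return_tensors="pt"),
        num_beams=num_candidates,
        num_return_sequences=num_candidates,
        output_scores=True,
        return_dict_in_generate=True,
    )
    texts = [fr_en_tok.decode(seq, skip_special_tokens=True) for seq in out.sequences]
    probs = out.sequences_scores.exp().tolist()  # length-normalized beam log-probs -> probabilities
    return list(zip(texts, probs))

# paraphrases_with_scores("Good movie!")  ->  e.g. [("Good movie!", 0.4), ("Good film!", 0.3), ...]
```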
Finding an adversary
What color is the tray? > Pink
Candidate paraphrases and their predictions:
- What colour is the tray? > Green
- Which color is the tray? > Green
- What color is it? > Green
- What color is tray? > Pink
- How color is the tray? > Green
The semantically equivalent ones that flip the prediction are SEAs.
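A compact sketch of this search, assuming a black-box predict function and a semantic-equivalence scorer (for instance built from the paraphrase scorer sketched above); the threshold value is a placeholder, not the talk's setting.

```python
def find_seas(x, predict, sem_score, tau=0.001):
    """Return semantically equivalent adversaries (SEAs) for input x.

    predict:   black-box model, text -> label (assumed interface)
    sem_score: semantic-equivalence score for (x, x'), e.g. derived from
               the back-translation paraphrase probabilities above
    tau:       equivalence threshold (placeholder value, would be tuned)
    """
    original = predict(x)
    seas = []
    for x_prime, _ in paraphrases_with_scores(x):
        if x_prime == x:
            continue
        if sem_score(x, x_prime) >= tau and predict(x_prime) != original:
            seas.append(x_prime)
    return seas
```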
Semantically Equivalent Adversarial Rules (SEARs)
From SEAs to Rules
Find SEAs → Propose Candidate Rules → Select Small Rule Set
Proposing Candidate Rules
SEA: What type of road sign is shown? → Which type of road sign is shown?
Candidate rules (exact match, with context, with POS tags):
(What → Which), (What type → Which type), (What NOUN → Which NOUN), (WP type → Which type), (WP NOUN → Which NOUN), …
Applied to other questions: What→Which is the person looking at?  What→Which was I thinking?
Rules must not change semantics
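A simplified sketch of how such candidates could be proposed from a single (x, x') pair, using spaCy for POS tags; this is my illustration of the idea on the slide, not the paper's exact procedure, and it assumes the en_core_web_sm model is installed.

```python
import difflib
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed: python -m spacy download en_core_web_sm

def candidate_rules(x, x_prime):
    """Propose rewrite rules turning x into x_prime at several levels of
    generality: the exact changed words, the change plus one word of
    following context, and a variant where that context word is replaced
    by its coarse POS tag."""
    src_doc = nlp(x)
    src = [t.text for t in src_doc]
    tags = [t.pos_ for t in src_doc]          # coarse tags, e.g. NOUN
    tgt = [t.text for t in nlp(x_prime)]

    rules = set()
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=src, b=tgt).get_opcodes():
        if op == "equal":
            continue
        lhs, rhs = src[i1:i2], tgt[j1:j2]
        rules.add((" ".join(lhs), " ".join(rhs)))                                 # (What -> Which)
        if i2 < len(src):
            nxt = src[i2]
            rules.add((" ".join(lhs + [nxt]), " ".join(rhs + [nxt])))             # (What type -> Which type)
            rules.add((" ".join(lhs + [tags[i2]]), " ".join(rhs + [tags[i2]])))   # (What NOUN -> Which NOUN)
    return rules

# candidate_rules("What type of road sign is shown?", "Which type of road sign is shown?")
# -> {("What", "Which"), ("What type", "Which type"), ("What NOUN", "Which NOUN")}
```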
From SEAs to Rules
Find SEAs → Propose Candidate Rules → Select Small Rule Set
Semantically Equivalent Adversarial Rules (SEARs)
Select rules with:
- High adversary count: each rule induces many flipped predictions
- Non-redundancy: selected rules flip different predictions
Example candidates: What type → Which type, What NOUN → Which NOUN, color → colour
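One simple way to realize this selection objective is a greedy, set-cover style loop; this is a sketch of the idea on the slide, not the paper's exact objective.

```python
def select_rules(rule_flips, k=10):
    """Greedily pick a small rule set.

    rule_flips: dict mapping each candidate rule to the set of validation
                instances whose prediction it flips.
    Greedily add the rule covering the most not-yet-covered instances,
    which rewards high adversary count while penalizing redundancy.
    """
    selected, covered = [], set()
    for _ in range(k):
        best = max(rule_flips, key=lambda r: len(rule_flips[r] - covered))
        if not rule_flips[best] - covered:
            break  # no remaining rule flips anything new
        selected.append(best)
        covered |= rule_flips[best]
    return selected
```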
Examples: VQA
Visual7W-Telling [Zhu et al 2016]
Examples: Machine Comprehension
BiDAF [Seo et al 2017]
Examples: Movie Review Sentiment Analysis
FastText [Joulin et al 2016]
Experiments
1. SEAs vs Humans
Setup
Compare adversaries from: Humans, the top-scored SEA, and Humans + SEA (humans pick from the top 5 SEAs).
Evaluate adversaries for semantic equivalence.
How often can SEAs be produced?
[Bar charts: % of instances with semantically equivalent adversaries for Human, SEA, and Human + SEA, on Sentiment Analysis and Visual Question Answering]
SEAs find equivalent adversaries as often as Humans
SEAs + Humans better than Humans
Humans produce different adversaries
Humans did not produce these: “What kind of meat is on the boy’s plate?”, “They are so easy to love…”
But they did produce these: “How many suitcases?”, “Also great directing and photography”
2. SEARs vs Experts
Part 1: experts come up with rules
Objective: maximize mistakes with good rules
Part 2: experts evaluate our SEARs
Experts only accept good rules
Results
[Bar charts: % of correct predictions flipped (Experts vs SEARs) and time in minutes (finding rules vs evaluating SEARs), for Visual QA and Sentiment]
SEARs flip more correct predictions than expert-written rules, and evaluating SEARs takes less time than finding rules.
3. Fixing bugs
Closing the loop
Filter out bad rules, then augment the training data with the accepted rules (e.g., color → colour, WP VBZ → WP’s, …) and retrain the model.
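A minimal sketch of the augmentation step, assuming plain textual replacement rules and a dataset of (text, label) pairs; POS-tag rules as in the talk would additionally need a tagger, and this is not the authors' code.

```python
import re

def augment(dataset, rules):
    """Apply accepted SEARs to the training data and add the rewritten
    examples with their original labels; retrain on the result."""
    augmented = list(dataset)
    for text, label in dataset:
        for lhs, rhs in rules:
            new_text = re.sub(r"\b" + re.escape(lhs) + r"\b", rhs, text)
            if new_text != text:
                augmented.append((new_text, label))
    return augmented

# e.g. augment(train, [("color", "colour"), ("What is", "What's")])
```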
Results
[Bar chart: % of flips due to bugs, original vs augmented model, for Visual QA and Sentiment]
Fixes bugs, no loss in accuracy
Conclusion
Semantics matter. Models are prone to these bugs. SEAs and SEARs help find and fix them.
Semantic scoring is still a research problem… Also: inaccurate for long texts
Problem: paraphrase scores are not comparable across instances
[Example: the top back-translation paraphrases and probabilities for “good movie” (Good movie, Good film, Great movie, …) live on a different scale than those for “good” (good, great, excellent, …)]
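A natural fix, along the lines of the semantic score used in the SEARs paper (my reconstruction; the exact definition is not on this slide), is to normalize each paraphrase probability by the probability the paraphraser assigns to reproducing the sentence unchanged:

\[ \mathrm{SemEq}(x, x') = \min\!\left(1,\; \frac{P_{\mathrm{para}}(x' \mid x)}{P_{\mathrm{para}}(x \mid x)}\right) \]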
Examples: VQA
Examples: Movie Review Sentiment Analysis
FastText [Joulin et al 2016]