Semantically Equivalent Adversarial Rules for Debugging NLP Models
Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh (UC Irvine)
NLP / ML models are getting smarter: VQA
What type of road sign is shown? > STOP.
Visual7W [Zhu et al 2016]
NLP / ML models are getting smarter: MC (SQuAD)
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
BiDAF [Seo et al 2017]
Question: are these models prone to oversensitivity?
Oversensitivity in images
“panda” (57.7% confidence) → “gibbon” (99.3% confidence)
Adversaries are indistinguishable to humans… but unlikely in the real world (except for attacks)
Adversarial examples
Find the closest example with a different prediction
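In symbols, the standard formulation (not spelled out on the slide) is: given a model f and an input x, an adversarial example is the nearest input under some distance d that changes the prediction:

\[ x' \in \arg\min_{z}\, d(x, z) \quad \text{s.t.} \quad f(z) \neq f(x) \]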
What about text?
What type of road sign is shown? > STOP.
What type of road sign is sho wn? (character-level perturbation)
Perceptible by humans, unlikely in the real world
What about text?
What type of road sign is shown? > STOP.
Replacing a single word changes too much!
Semantics matter
What type of road sign is shown? > STOP.
Which type of road sign is shown? > Do not Enter.
A bug, and likely in the real world
Semantics matter
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
How long is the Rhine?? > More than 1,050,000
Not all changes are the same: semantics matter
Adversarial Rules
Find a rule that generates many adversaries
Generalizing adversaries
What type of road sign is shown? > STOP.
Which type of road sign is shown? > Do not Enter.
Rule: What NOUN → Which NOUN (flips 3.9% of examples)
Semantics matter
What color is the sky? > Blue.
Which color is the sky? > Gray.
Rule: What NOUN → Which NOUN (flips 3.9% of examples)
Semantics matter
Passage: The biggest city on the river Rhine is Cologne, Germany with a population of more than 1,050,000 people. It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi).
How long is the Rhine? > 1230 km
How long is the Rhine?? > More than 1,050,000
Rule: ? → ?? (flips 3% of examples)
Semantics matter
Passage: Detailed investigation of chum salmon, Oncorhynchus keta, showed that these fish digest ctenophores 20 times as fast as an equal weight of shrimps.
What is the oncorhynchus also called? > chum salmon
What is the oncorhynchus also called?? > Oncorhynchus keta
Rule: ? → ?? (flips 3% of examples)
Adversarial Rules
Rules are global and actionable, more interesting than individual adversaries
Semantically Equivalent Adversary (SEA)
Ingredients
- A semantic score function
- A black-box model
Goal: a different prediction on a semantically equivalent input
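Putting the ingredients together, a formulation along the lines of the SEARs paper (my paraphrase; the threshold τ is a semantic-equivalence cutoff):

\[ \mathrm{SEA}(x, x') = \mathbb{1}\big[\, \mathrm{SemEq}(x, x') \ge \tau \;\wedge\; f(x) \ne f(x') \,\big] \]

i.e., x' is a semantically equivalent adversary for x if it scores as equivalent to x yet receives a different prediction from the black-box model f.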
Revisiting adversaries
Find the closest example with a different prediction
Semantic Similarity: Paraphrasing [Mallinson et al, 2017]
Translate the sentence into pivot languages and back: Good movie! > (en-pt) Bom filme!, (en-fr) Bon film! > back-translate to English.
The back-translators score candidate paraphrases: Good movie (0.35), Good film (0.34), Great movie (0.1), …, Movie good (0.001).
A language model over paraphrases comes for free.
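As a rough illustration of this back-translation pipeline, here is a minimal sketch using the open Helsinki-NLP MarianMT models from HuggingFace transformers, with English-French as a single pivot. These are not the translation models used in the talk, and the returned scores are only illustrative.

```python
from transformers import MarianMTModel, MarianTokenizer

# Assumed pivot: English <-> French MarianMT checkpoints (the talk uses several pivot languages).
en_fr_tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
en_fr = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
fr_en_tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en")
fr_en = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-fr-en")

def paraphrases_with_scores(sentence, num_candidates=8):
    """Translate to the pivot language, back-translate with beam search,
    and return candidate paraphrases with their beam probabilities."""
    pivot_ids = en_fr.generate(**en_fr_tok(sentence, return_tensors="pt"))
    pivot = en_fr_tok.decode(pivot_ids[0], skip_special_tokens=True)
    out = fr_en.generate(
        **fr_en_tok(pivot, return_tensors="pt"),
        num_beams=num_candidates,
        num_return_sequences=num_candidates,
        output_scores=True,
        return_dict_in_generate=True,
    )
    texts = [fr_en_tok.decode(seq, skip_special_tokens=True) for seq in out.sequences]
    probs = out.sequences_scores.exp().tolist()  # length-normalized beam log-probs -> probabilities
    return list(zip(texts, probs))

# paraphrases_with_scores("Good movie!")  ->  e.g. [("Good movie!", 0.4), ("Good film!", 0.3), ...]
```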
Finding an adversary
What color is the tray? > Pink
Candidate paraphrases and their predictions:
- What colour is the tray? > Green
- Which color is the tray? > Green
- What color is it? > Green
- What color is tray? > Pink
- How color is the tray? > Green
The semantically equivalent ones that flip the prediction are SEAs.
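A compact sketch of this search, assuming a black-box predict function and a semantic-equivalence scorer (for instance built from the paraphrase scorer sketched above); the threshold value is a placeholder, not the talk's setting.

```python
def find_seas(x, predict, sem_score, tau=0.001):
    """Return semantically equivalent adversaries (SEAs) for input x.

    predict:   black-box model, text -> label (assumed interface)
    sem_score: semantic-equivalence score for (x, x'), e.g. derived from
               the back-translation paraphrase probabilities above
    tau:       equivalence threshold (placeholder value, would be tuned)
    """
    original = predict(x)
    seas = []
    for x_prime, _ in paraphrases_with_scores(x):
        if x_prime == x:
            continue
        if sem_score(x, x_prime) >= tau and predict(x_prime) != original:
            seas.append(x_prime)
    return seas
```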
Semantically Equivalent Adversarial Rules (SEARs)
From SEAs to Rules
Find SEAs → Propose Candidate Rules → Select Small Rule Set
Proposing Candidate Rules
SEA: What type of road sign is shown? → Which type of road sign is shown?
Candidate rules (exact match, with context, with POS tags):
(What → Which), (What type → Which type), (What NOUN → Which NOUN), (WP type → Which type), (WP NOUN → Which NOUN), …
Applied to other questions: What→Which is the person looking at?  What→Which was I thinking?
Rules must not change semantics
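A simplified sketch of how such candidates could be proposed from a single (x, x') pair, using spaCy for POS tags; this is my illustration of the idea on the slide, not the paper's exact procedure, and it assumes the en_core_web_sm model is installed.

```python
import difflib
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed: python -m spacy download en_core_web_sm

def candidate_rules(x, x_prime):
    """Propose rewrite rules turning x into x_prime at several levels of
    generality: the exact changed words, the change plus one word of
    following context, and a variant where that context word is replaced
    by its coarse POS tag."""
    src_doc = nlp(x)
    src = [t.text for t in src_doc]
    tags = [t.pos_ for t in src_doc]          # coarse tags, e.g. NOUN
    tgt = [t.text for t in nlp(x_prime)]

    rules = set()
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=src, b=tgt).get_opcodes():
        if op == "equal":
            continue
        lhs, rhs = src[i1:i2], tgt[j1:j2]
        rules.add((" ".join(lhs), " ".join(rhs)))                                 # (What -> Which)
        if i2 < len(src):
            nxt = src[i2]
            rules.add((" ".join(lhs + [nxt]), " ".join(rhs + [nxt])))             # (What type -> Which type)
            rules.add((" ".join(lhs + [tags[i2]]), " ".join(rhs + [tags[i2]])))   # (What NOUN -> Which NOUN)
    return rules

# candidate_rules("What type of road sign is shown?", "Which type of road sign is shown?")
# -> {("What", "Which"), ("What type", "Which type"), ("What NOUN", "Which NOUN")}
```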
From SEAs to Rules
Find SEAs → Propose Candidate Rules → Select Small Rule Set
Semantically Equivalent Adversarial Rules (SEARs)
Select rules with:
- High adversary count: each rule induces many flipped predictions
- Non-redundancy: selected rules flip different predictions
Example candidates: What type → Which type, What NOUN → Which NOUN, color → colour
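One simple way to realize this selection objective is a greedy, set-cover style loop; this is a sketch of the idea on the slide, not the paper's exact objective.

```python
def select_rules(rule_flips, k=10):
    """Greedily pick a small rule set.

    rule_flips: dict mapping each candidate rule to the set of validation
                instances whose prediction it flips.
    Greedily add the rule covering the most not-yet-covered instances,
    which rewards high adversary count while penalizing redundancy.
    """
    selected, covered = [], set()
    for _ in range(k):
        best = max(rule_flips, key=lambda r: len(rule_flips[r] - covered))
        if not rule_flips[best] - covered:
            break  # no remaining rule flips anything new
        selected.append(best)
        covered |= rule_flips[best]
    return selected
```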
Examples: VQA
Visual7W-Telling [Zhu et al 2016]
Examples: Machine Comprehension
BiDAF [Seo et al 2017]
Examples: Movie Review Sentiment Analysis
FastText [Joulin et al 2016]
Experiments
1. SEAs vs Humans
Setup
Compare adversaries from: Humans, the top-scored SEA, and Humans + SEA (humans pick from the top 5 SEAs).
Evaluate adversaries for semantic equivalence.
How often can SEAs be produced?
[Bar charts: % of instances with semantically equivalent adversaries for Human, SEA, and Human + SEA, on Sentiment Analysis and Visual Question Answering]
SEAs find equivalent adversaries as often as Humans
SEAs + Humans better than Humans
Humans produce different adversaries
Humans did not produce these: “What kind of meat is on the boy’s plate?”, “They are so easy to love…”
But they did produce these: “How many suitcases?”, “Also great directing and photography”
2. SEARs vs Experts
Part 1: experts come up with rules
Objective: maximize mistakes with good rules
Part 2: experts evaluate our SEARs
Experts only accept good rules
Results
[Bar charts: % of correct predictions flipped (Experts vs SEARs) and time in minutes (finding rules vs evaluating SEARs), for Visual QA and Sentiment]
SEARs flip more correct predictions than expert-written rules, and evaluating SEARs takes less time than finding rules.
3. Fixing bugs
Closing the loop
Filter out bad rules, then augment the training data with the accepted rules (e.g., color → colour, WP VBZ → WP’s, …) and retrain the model.
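A minimal sketch of the augmentation step, assuming plain textual replacement rules and a dataset of (text, label) pairs; POS-tag rules as in the talk would additionally need a tagger, and this is not the authors' code.

```python
import re

def augment(dataset, rules):
    """Apply accepted SEARs to the training data and add the rewritten
    examples with their original labels; retrain on the result."""
    augmented = list(dataset)
    for text, label in dataset:
        for lhs, rhs in rules:
            new_text = re.sub(r"\b" + re.escape(lhs) + r"\b", rhs, text)
            if new_text != text:
                augmented.append((new_text, label))
    return augmented

# e.g. augment(train, [("color", "colour"), ("What is", "What's")])
```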
Results
[Bar chart: % of flips due to bugs, original vs augmented model, for Visual QA and Sentiment]
Fixes bugs, no loss in accuracy
Conclusion
Semantics matter. Models are prone to these bugs. SEAs and SEARs help find and fix them.
Semantic scoring is still a research problem… Also: inaccurate for long texts
Problem: paraphrase scores are not comparable across instances
[Example: the top back-translation paraphrases and probabilities for “good movie” (Good movie, Good film, Great movie, …) live on a different scale than those for “good” (good, great, excellent, …)]
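A natural fix, along the lines of the semantic score used in the SEARs paper (my reconstruction; the exact definition is not on this slide), is to normalize each paraphrase probability by the probability the paraphraser assigns to reproducing the sentence unchanged:

\[ \mathrm{SemEq}(x, x') = \min\!\left(1,\; \frac{P_{\mathrm{para}}(x' \mid x)}{P_{\mathrm{para}}(x \mid x)}\right) \]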
Examples: VQA
Examples: Movie Review Sentiment Analysis
FastText [Joulin et al 2016]