Biases in NLP Models and What It Takes to Control Them
Kai-Wei Chang


  1. Biases in NLP Models and What It Takes to Control Them (Kai-Wei Chang)

  2. A cartoon of the ML (NLP) pipeline: Data → Representation (e.g., word embeddings) → (Structured) Inference → Prediction → Evaluation, drawing on auxiliary corpora/models.

  3. Motivating Example: Coreference Resolution
  • Coreference resolution is biased [1,2]
  • Models fail for female pronouns given the same context (his ⇒ her), both in semantics-only cases and with syntactic cues
  [1] Zhao et al., Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods. NAACL 2018.
  [2] Rudinger et al., Gender Bias in Coreference Resolution. NAACL 2018.

  4. WinoBias data
  • Stereotypical dataset
  • Anti-stereotypical dataset
  (A templating sketch follows below.)
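To make the two datasets concrete, here is a minimal sketch of how a pro/anti-stereotypical sentence pair can be templated. The template and occupation words are illustrative, not the exact WinoBias items.

```python
# Sketch: building a WinoBias-style sentence pair. A coreference
# system should resolve the pronoun to the same entity in both
# variants; an accuracy gap between them reveals gender bias.

TEMPLATE = "The {occ1} called the {occ2} because {pronoun} needed help."

def make_pair(occ1: str, occ2: str) -> tuple[str, str]:
    """Return (pro-stereotypical, anti-stereotypical) sentences.

    The pronoun refers to occ1; here "physician" is stereotypically
    male, so "he" gives the pro-stereotypical variant.
    """
    pro = TEMPLATE.format(occ1=occ1, occ2=occ2, pronoun="he")
    anti = TEMPLATE.format(occ1=occ1, occ2=occ2, pronoun="she")
    return pro, anti

pro, anti = make_pair("physician", "secretary")
print(pro)   # The physician called the secretary because he needed help.
print(anti)  # The physician called the secretary because she needed help.
```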

  5-7. Gender bias in coreference systems
  [Chart, built up over three slides: scores (y-axis roughly 48-78) on the Stereotype set, the Anti-Stereotype set, and their average, for four systems: Neural Coref Model, E2E, E2E (Debiased WE), and E2E (Full model). Debiasing the word embeddings mitigates WE bias; the full model additionally mitigates data bias, shrinking the stereotype vs. anti-stereotype gap.]
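The underlying papers report F1 on the pro- and anti-stereotypical subsets; a common way to summarize such a chart is the average (overall quality) and the gap (bias). A minimal sketch; the numbers below are placeholders, not values read off the chart.

```python
# Summarize coreference performance on WinoBias-style splits.
# f1_pro / f1_anti are F1 scores on the stereotypical and
# anti-stereotypical subsets; the values here are placeholders.
def bias_summary(f1_pro: float, f1_anti: float) -> dict:
    return {
        "avg": (f1_pro + f1_anti) / 2,  # overall quality
        "gap": f1_pro - f1_anti,        # near 0 for an unbiased system
    }

print(bias_summary(f1_pro=70.0, f1_anti=60.0))
# {'avg': 65.0, 'gap': 10.0}
```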

  8. Misrepresentation and Bias

  9-10. Stereotypes: Which word is more likely to be used by a female? Giggle – Laugh (Preotiuc-Pietro et al., 2016). Credit: Yulia Tsvetkov

  11-12. Stereotypes: Which word is more likely to be used by an older person? Impressive – Amazing (Preotiuc-Pietro et al., 2016). Credit: Yulia Tsvetkov

  13. Why do we intuitively recognize a default social group? Credit: Yulia Tsvetkov

  14. Why do we intuitively recognize a default social group? Implicit bias. Credit: Yulia Tsvetkov

  15. Biased AI: data is riddled with implicit bias. (Modified from Yulia Tsvetkov's slide)

  16. Bias in Wikipedia
  • Only a small portion of editors are female
  • Articles about women are less extensive
  • Fewer topics important to women are covered
  (Ruediger et al., 2010)

  17. Biased AI. Consequence: models are biased. Credit: Yulia Tsvetkov

  18. Bias in Language Generation
  • Language generation models (GPT-2) are biased
  "The Woman Worked as a Babysitter: On Biases in Language Generation" (Sheng et al., EMNLP 2019)
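Sheng et al. probe GPT-2 with demographic prompt templates such as "The X worked as". A minimal sketch with the Hugging Face transformers pipeline; the paper's actual scoring of continuations (sentiment and "regard") is omitted here.

```python
# Sketch: sampling GPT-2 continuations for demographic prompt
# templates in the style of Sheng et al. (2019).
# Requires: pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompts = ["The woman worked as", "The man worked as"]
for prompt in prompts:
    outputs = generator(prompt, max_new_tokens=15,
                        do_sample=True, num_return_sequences=3)
    for out in outputs:
        print(out["generated_text"])
# Systematic differences in how the continuations describe each
# group are then quantified with sentiment / regard classifiers.
```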

  19. Where Are the Biases?

  20. A cartoon of the ML (NLP) pipeline: Data → Representation (e.g., word embeddings) → (Structured) Inference → Prediction → Evaluation, drawing on auxiliary corpora/models.

  21. Representational Harm in NLP: Word Embeddings Can Be Sexist
  "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings" [Bolukbasi et al., NeurIPS 2016]
  Given the gender direction $w_{he} - w_{she}$, find word pairs $(a, b)$ with a parallel direction by $\cos(w_a - w_b,\; w_{he} - w_{she})$:
  he:brother – she:sister
  he:beer – she:cocktail
  he:physician – she:registered_nurse
  he:professor – she:associate professor
  (Google word2vec embeddings trained on news text)
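A minimal sketch of this analogy probe with gensim, assuming the Google News word2vec vectors have been downloaded locally (the file name below is the standard one, but treat it as illustrative):

```python
# Sketch: scoring word pairs by alignment with the he-she direction,
# as in Bolukbasi et al. (2016).
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

gender_dir = wv["he"] - wv["she"]

def pair_score(a: str, b: str) -> float:
    """How parallel is the a-b difference to the he-she direction?"""
    return cos(wv[a] - wv[b], gender_dir)

print(pair_score("brother", "sister"))            # high: gendered pair
print(pair_score("physician", "registered_nurse"))  # high: stereotyped pair
```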

  22. Implicit Association Test (IAT)
  • Greenwald et al., 1998
  • Detects the strength of a person's subconscious associations between mental representations of objects (concepts), e.g., Boy/Math vs. Girl/Reading
  https://en.wikipedia.org/wiki/Implicit-association_test
  https://implicit.harvard.edu

  23-34. Implicit Association Test demo (https://implicit.harvard.edu): stimulus words such as Emily, Tom, number, Algebra, Julia, Literature, and Dan are sorted under the category labels Boy vs. Girl and Math vs. Reading, first with the pairing Boy+Math / Girl+Reading and then with the reversed pairing Boy+Reading / Girl+Math; the difference in sorting speed between the two pairings measures the strength of the implicit association.

  35. Word Embedding Association Test (WEAT)
  • Target words X: "mathematics", "science"; Y: "arts", "design"
  • Attribute words A: "male", "boy"; B: "female", "girl"
  [Diagram: a target word such as "mathematics" is compared against the attribute sets "male"/"boy" and "female"/"girl"]
  Caliskan et al., Semantics derived automatically from language corpora contain human-like biases. Science, 2017.

  36. Word Embedding Association Test (WEAT)
  • X: "mathematics", "science"; Y: "arts", "design"; A: "male", "boy"; B: "female", "girl"
  • Differential association of a single target word $w$ with the two attribute sets:
    $s(w, A, B) = \mathrm{mean}_{a \in A} \cos(\vec{w}, \vec{a}) - \mathrm{mean}_{b \in B} \cos(\vec{w}, \vec{b})$
  • Aggregate over the target words:
    $s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)$
  Caliskan et al., Semantics derived automatically from language corpora contain human-like biases. Science, 2017.

  37. Word Embedding Association Test (WEAT)
  • The effect size of bias:
    $d = \dfrac{\mathrm{mean}_{x \in X}\, s(x, A, B) - \mathrm{mean}_{y \in Y}\, s(y, A, B)}{\mathrm{std}_{w \in X \cup Y}\, s(w, A, B)}$
  Caliskan et al., Semantics derived automatically from language corpora contain human-like biases. Science, 2017.
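A direct transcription of these formulas into code; `wv` stands for any word-to-vector mapping, e.g. the gensim vectors loaded in the earlier sketch.

```python
# Sketch: WEAT association statistic and effect size
# (Caliskan et al., 2017).
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def s_word(w, A, B, wv):
    """Differential association of word w with attribute sets A, B."""
    return (np.mean([cos(wv[w], wv[a]) for a in A])
            - np.mean([cos(wv[w], wv[b]) for b in B]))

def weat_effect_size(X, Y, A, B, wv):
    """d = (mean_X s - mean_Y s) / std_{X∪Y} s."""
    sx = [s_word(x, A, B, wv) for x in X]
    sy = [s_word(y, A, B, wv) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

X = ["mathematics", "science"]; Y = ["arts", "design"]
A = ["male", "boy"];            B = ["female", "girl"]
# d = weat_effect_size(X, Y, A, B, wv)  # positive d: X closer to A
```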

  38. Word Embedding Association Test (Caliskan et al., 2017)
  [Table: IAT effect sizes for humans alongside the corresponding WEAT effect sizes for word embeddings]

  39. Word Embedding Association Test (Caliskan et al., 2017)
  WEAT finds biases in word embeddings similar to those the IAT finds in humans.

  40. [Figure: word embedding space showing gendered pairs she-he, mother-father, queen-king]

  42. Can We Extend the Analysis beyond Binary Gender?

  43. Beyond Gender: Race/Ethnicity Bias (Manzini et al., NAACL 2019)
  Biases in word embeddings trained on Reddit data from US users.

  44. How about Other Embeddings?

  45. Bias Only in English?
  • Languages with grammatical gender
  • Morphological agreement
  (Zhou et al., EMNLP 2019)

  46. • Linear Discriminant Analysis (LDA)
  • Identify the grammatical gender direction separating feminine words from masculine words (see the sketch below)
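A minimal sketch of this step with scikit-learn. The Spanish noun lists and the embedding source are illustrative; in practice one would fit on hundreds of nouns from a gendered lexicon.

```python
# Sketch: estimating a grammatical-gender direction with Linear
# Discriminant Analysis, following the approach described above.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def grammatical_gender_direction(wv, feminine, masculine):
    """Fit LDA on embeddings of grammatically feminine vs. masculine
    nouns and return the unit-norm discriminating direction."""
    X = np.stack([wv[w] for w in feminine + masculine])
    y = [0] * len(feminine) + [1] * len(masculine)
    lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
    direction = lda.scalings_[:, 0]
    return direction / np.linalg.norm(direction)

# Illustrative Spanish noun lists (grammatical, not semantic, gender).
feminine = ["mesa", "casa", "luna", "silla"]
masculine = ["libro", "cielo", "perro", "coche"]
# wv_es: word -> vector mapping for Spanish embeddings, assumed loaded.
# direction = grammatical_gender_direction(wv_es, feminine, masculine)
```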

  47-49. [Figure, built up over three slides: words plotted with grammatical gender (masculine-feminine) on one axis and semantic gender (Male-Female) on the other]

  50. How about Bilingual Embeddings? [Zhou et al., EMNLP 2019]
  [Figure: bilingual embedding space contrasting the female and male forms of "doctor" in Spanish]

  51. How about Contextualized Representations?
  "Gender Bias in Contextualized Word Embeddings" (Zhao et al., NAACL 2019)
  • Gender direction: ELMo(driver | feminine sentence) − ELMo(driver | masculine sentence)
    (Feminine) The driver stopped the car at the hospital because she was paid to do so
    (Masculine) The driver stopped the car at the hospital because he was paid to do so
  • The first two principal components of such difference vectors explain more variance than the others
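A minimal sketch of extracting one such contextual gender direction. The paper uses ELMo; BERT via Hugging Face transformers stands in here purely to keep the example short, using the sentence pair from the slide.

```python
# Sketch: contextual gender direction for "driver" from a
# feminine/masculine sentence pair, in the spirit of Zhao et al.
# (2019). Requires: pip install transformers torch
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vec(sentence: str, word: str) -> torch.Tensor:
    """Contextualized embedding of `word` inside `sentence`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[idx]

fem = "The driver stopped the car at the hospital because she was paid to do so"
masc = "The driver stopped the car at the hospital because he was paid to do so"
gender_dir = word_vec(fem, "driver") - word_vec(masc, "driver")
# Collecting such difference vectors over many sentence pairs and
# running PCA shows the first components dominate the variance.
```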
