


  1. Quick Question • A doctor is walking down the street with a boy. The boy is the doctor’s son, but the doctor is not the boy’s father. How is that possible? GENDER BIAS IN WORD EMBEDDINGS 1

  2. Simple Answer • The doctor is the boy’s mother…

  3. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
      • Hila Gonen and Yoav Goldberg, Bar Ilan University
      • INRIA Paris, 12/3/19
      • Accepted to NAACL 2019

  4. Outline
      • Background
      • Gender bias in word embeddings
      • Current debiasing methods
        • Post-processing (Bolukbasi et al.)
        • During training (Zhao et al.)
      • Experiments that reveal the remaining bias
      • Conclusion

  5. Gender Bias in Applications

  6. What do we mean by gender bias?

  7. What do we mean by gender bias?

  8. What do we mean by gender bias? (Zhao et al., NAACL, 2018)

  9. What do we mean by gender bias? (Hendricks et al., 2018)

  10. Word Embeddings • TopK lists: nurse (Mikolov et al. 2013)

  11. Word Embeddings • We will focus on gender bias in word embeddings

  12. Bias in word embeddings

  13. Bias in Word Embeddings

  14. Bias in Word Embeddings
      • Caliskan et al. replicate a spectrum of known biases from the literature using word embeddings
      • Show that text corpora contain several types of biases:
        • morally neutral, as toward insects or flowers
        • problematic, as toward race or gender
        • veridical, reflecting the distribution of gender with respect to careers or first names
      • Introduce methods for identifying these biases

  15. Bias in Word Embeddings

  16. Bias in Word Embeddings
      • Permutation test:
        • X, Y: sets of target words (e.g. male names vs. female names)
        • A, B: sets of attribute words (e.g. career terms vs. family terms)
        • P-value is:

  17. Bias in Word Embeddings

      | Concepts 1 | Concepts 2 | Attributes 1 | Attributes 2 |
      |---|---|---|---|
      | Flowers: buttercup, daisy, lily | Insects: ant, caterpillar, flea | Pleasant: freedom, health, love | Unpleasant: abuse, crash, filth |
      | European American names: Brad, Brendan | African American names: Darnell, Lakisha | Pleasant: joy, love, peace | Unpleasant: agony, terrible |
      | Male attributes: male, man, boy | Female attributes: female, woman, girl | Math words: math, algebra, geometry | Arts words: poetry, art, dance |

  18. Definitions of Gender Bias in Word Embeddings

  19. Definition of Gender Bias in Word Embeddings (NIPS, 2016)

  20. Definition of Gender Bias in Word Embeddings
      • We check how similar a word is to “he” and “she” (cosine similarity)
      • Note that we care about the difference between the two
      • This is the projection on the direction of “he − she”*
      * This is the gender direction; it can be computed using several pairs together (e.g. man-woman, brother-sister)
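
A minimal sketch of this bias-by-projection measure, assuming all vectors are length-normalized first, so that the difference of the two cosine similarities equals the dot product with the "he − she" vector. The 3-dimensional vectors are made-up toy values, not real embeddings, and `bias_by_projection` is a hypothetical helper name.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bias_by_projection(word_vec, he_vec, she_vec):
    """cos(word, he) - cos(word, she), computed as a projection on he - she."""
    w, he, she = normalize(word_vec), normalize(he_vec), normalize(she_vec)
    direction = [a - b for a, b in zip(he, she)]
    return dot(w, direction)

# Toy vectors (assumed for illustration only).
he = [1.0, 0.0, 0.0]
she = [0.0, 1.0, 0.0]
captain = [0.6, 0.2, 0.7]
nurse = [0.1, 0.8, 0.5]
table = [0.0, 0.0, 1.0]

for name, vec in [("captain", captain), ("nurse", nurse), ("table", table)]:
    print(name, round(bias_by_projection(vec, he, she), 4))
```

Positive values indicate a lean toward "he", negative values a lean toward "she", matching the sign convention of the example values on the next slide.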

  21. Definition of Gender Bias in Word Embeddings
      • bias(consultant) = -0.0023
      • bias(nurse) = -0.2471
      • bias(captain) = 0.1521
      • bias(table) = -0.0003
      (Zhao et al.)

  22. Reducing Gender Bias in Word Embeddings

  23. Reduce Bias after Training • Bolukbasi et al. suggest removing bias after training by removing the projection of every neutral word on the gender direction

  24. Reduce Bias after Training
      • 1. Define a gender direction:
        • 10 gender pair difference vectors:
          woman, man | girl, boy | she, he | mother, father | daughter, son | gal, guy | female, male | her, his | herself, himself | Mary, John
        • Compute and use their principal component
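
One way to sketch this step: center each pair around its midpoint and take the top principal component of the centered vectors. Power iteration is used here only to stay within the standard library; it is a swapped-in technique, and the slides do not say how the component is computed. The toy pairs and the name `gender_direction` are assumptions.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def gender_direction(pairs, iters=100, seed=0):
    """Top principal component of pair-centered vectors, via power iteration."""
    centered = []
    for a, b in pairs:
        mid = [(x + y) / 2 for x, y in zip(a, b)]
        centered.append([x - m for x, m in zip(a, mid)])
        centered.append([y - m for y, m in zip(b, mid)])
    rng = random.Random(seed)
    v = unit([rng.gauss(0, 1) for _ in centered[0]])
    for _ in range(iters):
        # Multiply v by the (unnormalized) covariance matrix sum(d d^T).
        new = [0.0] * len(v)
        for d in centered:
            c = dot(d, v)
            for i, x in enumerate(d):
                new[i] += c * x
        v = unit(new)
    return v

# Toy (female, male) pairs whose difference lies mostly along axis 0.
pairs = [
    ([-1.0, 0.2, 0.1], [1.0, 0.3, 0.0]),
    ([-0.9, -0.5, 0.4], [1.1, -0.4, 0.5]),
    ([-1.2, 0.0, -0.3], [0.8, 0.1, -0.2]),
]
g = gender_direction(pairs)
```

The sign of a principal component is arbitrary, so only the axis it spans is meaningful; a convention (for example, "he" projects positively) has to be fixed separately.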

  25. Reduce Bias after Training
      • 2. Define inherently neutral words:
        • Identify the set of gender-specific words
        • The authors derive a list of 218 words from dictionary definitions (e.g. mother, aunt, chairman, girlfriend, prince)
        • The complementary set are the gender-neutral words
        • The authors generalize the list to a broader vocabulary using an SVM (~6500 words)

  26. Reduce Bias after Training
      • 3. Zero the projection of all neutral words on the gender direction: from each neutral word vector 𝑥, subtract the projection of 𝑥 on the gender direction
      • The bias of all neutral words is now zero by definition
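
A minimal sketch of this neutralizing step, assuming the gender direction is a single vector `g`. The helper name `neutralize` and the toy values are illustrative; any re-normalization of the result afterwards is left out here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def neutralize(w, g):
    """Remove from w its component along the gender direction g."""
    g = unit(g)
    proj = dot(w, g)  # signed length of the projection of w on g
    return [x - proj * gi for x, gi in zip(w, g)]

# Toy example: the gender direction is axis 0.
g = [2.0, 0.0, 0.0]
w = [0.6, 0.2, 0.7]
w_neutral = neutralize(w, g)
```

After this step the dot product of `w_neutral` with `g` is zero, which is exactly why the projection-based bias of every neutralized word is zero by construction.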

  27. Reduce Bias after Training
      • 4. Equalize:
        • A family of equality sets (pairs):
        • For each pair, compute: normalized, same bias for both words
        • Equalize the words in the pair:
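
The formulas for this step are not preserved on the slide. The sketch below follows the equalization rule as it is commonly stated for Bolukbasi et al.: both words of a pair keep the same gender-neutral part and receive gender components of equal magnitude and opposite sign, with the result at unit length. Treat it as an illustration under that assumption; `equalize` and the toy vectors are hypothetical, and the code assumes the two words do not already have identical projections on the gender direction.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def project(w, g):
    """Component of w along the unit vector g."""
    c = dot(w, g)
    return [c * gi for gi in g]

def equalize(a, b, g):
    """Give a and b the same neutral part and opposite, equal-size gender parts."""
    g, a, b = unit(g), unit(a), unit(b)
    mu = [(x + y) / 2 for x, y in zip(a, b)]
    mu_b = project(mu, g)
    nu = [m - mb for m, mb in zip(mu, mu_b)]       # shared gender-neutral part
    scale = math.sqrt(max(0.0, 1.0 - dot(nu, nu)))  # keeps the result at unit length
    out = []
    for w in (a, b):
        diff = [wb - mb for wb, mb in zip(project(w, g), mu_b)]
        out.append([n + scale * d for n, d in zip(nu, unit(diff))])
    return out

# Toy pair with the gender direction on axis 0.
g = [1.0, 0.0, 0.0]
a_eq, b_eq = equalize([0.9, 0.3, 0.1], [-0.5, 0.6, 0.2], g)
```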

  28. Reduce Bias after Training • We will refer to these embeddings as HARD-DEBIASED

  29. Reduce bias during training (EMNLP, 2018)

  30. Reduce bias during training
      • Zhao et al. suggest reducing bias during training:
        • Train word embeddings using GloVe (Pennington et al., 2014)
        • Alter the loss to encourage the gender information to concentrate in the last coordinate
        • To ignore gender information, simply remove the last coordinate

  31. Reduce bias during training • Loss: GloVe component (captures word proximity)
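
The loss formula is not preserved on this slide. For reference, the standard GloVe objective of Pennington et al. (2014) is shown below; that the slide's "GloVe component" is this objective is an assumption based on the slide's wording. Here X_ij is the co-occurrence count of words i and j, w_i and w̃_j are the word and context vectors, b_i and b̃_j are bias terms, f is GloVe's weighting function, and V is the vocabulary size.

```latex
J_G = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2
```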

  32. Reduce bias during training
      • Use two groups of male/female seed words, and encourage words from different groups to differ in their last coordinate (the gender coordinate(s))
      • Seed words are chosen according to WordNet
      • *The authors also experiment with another variant for this component

  33. Reduce bias during training • Encourage the representation of gender-neutral words, excluding the last coordinate (i.e. the vector without the gender coordinate(s)), to be orthogonal to the gender direction

  34. Reduce bias during training
      • Gender direction is estimated on the fly
        • Averaging differences between pairs of male-female words
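
The last two slides can be sketched together: estimate the gender direction as the average male-minus-female difference over the non-gender coordinates, then penalize neutral words whose non-gender part is not orthogonal to it. Using a sum of squared dot products as the penalty is an assumption about the form of the term; the function names and toy vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def strip_gender_coords(w, k=1):
    """Drop the last k coordinates, which are reserved for gender information."""
    return w[:-k]

def estimate_direction(pairs, k=1):
    """Average of (male - female) differences over the non-gender coordinates."""
    diffs = [
        [m - f for m, f in zip(strip_gender_coords(male, k), strip_gender_coords(female, k))]
        for male, female in pairs
    ]
    return [sum(col) / len(diffs) for col in zip(*diffs)]

def orthogonality_penalty(neutral_words, direction, k=1):
    """Sum of squared dot products; zero when every neutral word is orthogonal."""
    return sum(dot(direction, strip_gender_coords(w, k)) ** 2 for w in neutral_words)

# Toy 3-d vectors: the last coordinate plays the role of the gender coordinate.
pairs = [
    ([1.0, 0.0, 0.5], [-1.0, 0.0, -0.5]),
    ([0.8, 0.2, 0.4], [-0.6, 0.2, -0.4]),
]
neutral = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.1]]
v_g = estimate_direction(pairs)
penalty = orthogonality_penalty(neutral, v_g)
```

During training such a penalty would be recomputed as the embeddings change, which is what "on the fly" suggests; this snippet only evaluates it once on fixed vectors.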

  35. Reduce bias during training • We will refer to these embeddings as GN-GLOVE

  36. Some Results
      • Bolukbasi et al.:
        • Bias of all inherently-neutral words is zero by definition
        • Generated analogies are less stereotyped

  37. Some Results
      • Zhao et al.:
        • Decreased bias in co-reference resolution

  38. Problem solved? • Not so fast…

  39. Clustering male- and female-biased words
      • We take the most biased words in the vocabulary according to the original bias (500 male-biased and 500 female-biased)
      • We cluster them into two clusters using K-means
      • The clusters align with gender with accuracy of:
        • HARD-DEBIASED: 92.5% after debiasing, compared to 99.99% before
        • GN-GLOVE: 85.6% after debiasing, compared to 100% before
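
A small sketch of the clustering experiment on toy data: run 2-means on the vectors, then score how well the two clusters line up with the original male/female labels. Because cluster ids are arbitrary, the accuracy is taken as the better of the two possible labelings. The hand-written k-means and the 2-dimensional points are illustrative assumptions, not the setup used for the reported numbers.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(points, iters=50, seed=0):
    """Plain 2-means; returns a 0/1 cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1 for p in points]
        for c in (0, 1):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def alignment_accuracy(labels, gold):
    """Agreement between clusters and gold labels, up to swapping cluster ids."""
    agree = sum(l == g for l, g in zip(labels, gold))
    return max(agree, len(gold) - agree) / len(gold)

# Toy stand-ins for originally male-biased and female-biased word vectors.
male = [[1.0, 0.1], [0.9, -0.1], [1.1, 0.0], [1.0, 0.2]]
female = [[-1.0, 0.1], [-0.9, -0.1], [-1.1, 0.0], [-1.0, -0.2]]
points = male + female
gold = [0] * len(male) + [1] * len(female)
accuracy = alignment_accuracy(kmeans2(points), gold)
```

An accuracy near 50% would mean the clusters carry no gender information; the slide's point is that the debiased embeddings stay far above that.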

  40. Clustering male- and female-biased words (figure panels: HARD-DEBIASED, GN-GLOVE)

  41. Bias-by-neighbors
      • Bias is still manifested by the word being close to socially-marked feminine words
      • A new mechanism for measuring bias:
        • The percentage of male/female socially-biased words among the k nearest neighbors of the target word
      • Pearson correlation with bias-by-projection:
        • HARD-DEBIASED: 0.686 after debiasing, compared to 0.741 before
        • GN-GLOVE: 0.736 after debiasing, compared to 0.773 before
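
A minimal sketch of the bias-by-neighbors measure, assuming cosine similarity for the nearest-neighbor search and a precomputed set of words that were male-biased under the original projection-based definition. The tiny vocabulary, its vectors, and the function name are made up for illustration.

```python
import math

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def bias_by_neighbors(target, vectors, male_biased, k):
    """Fraction of male-biased words among the k nearest neighbors of target."""
    t = vectors[target]
    others = [w for w in vectors if w != target]
    nearest = sorted(others, key=lambda w: cos(t, vectors[w]), reverse=True)[:k]
    return sum(w in male_biased for w in nearest) / k

# Toy vocabulary (assumed): three words near each of two regions.
vectors = {
    "nurse": [0.1, 0.9],
    "receptionist": [0.2, 0.8],
    "caregiver": [0.0, 1.0],
    "captain": [0.9, 0.1],
    "pilot": [1.0, 0.0],
    "officer": [0.8, 0.3],
}
male_biased = {"captain", "pilot", "officer"}

nurse_score = bias_by_neighbors("nurse", vectors, male_biased, k=2)
captain_score = bias_by_neighbors("captain", vectors, male_biased, k=2)
```

The measure depends only on which words surround the target, so it can stay high or low even when the target's own projection on the gender direction has been set to zero.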

  42. Professions
      • We take a predefined list of professions
      • We show the correlation between bias-by-projection and bias-by-neighbors, before and after debiasing

  43. Professions (figure panels: HARD-DEBIASED, GN-GLOVE)

  44. Association with stereotyped words
      • We evaluate the association between female/male names and female/male stereotyped words (experiments taken from Caliskan et al.)

      | | Female-associated | Male-associated |
      |---|---|---|
      | names | Amy, Joan, Lisa | John, Paul, Mike |
      | family vs. career | home, parents, children | executive, management, professional |
      | arts vs. math | poetry, art, dance | math, algebra, geometry |
      | arts vs. science | dance, literature, novel | science, technology, physics |

      • All the associations have very small p-values

  45. Classifying to gender
      • Can a classifier learn to generalize from some gendered words to others based only on their representations?
      • Setup: from the 5000 most biased words, an SVM is trained on 1000 and tested on the remaining 4000
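
The shape of this experiment can be sketched as follows: fit a classifier on a labeled subset of word vectors and measure accuracy on held-out words. A simple perceptron is used here as a stand-in for the SVM so that the example needs only the standard library; it is a different classifier from the one on the slide, and the four training vectors and two test vectors are toy values.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_perceptron(X, y, epochs=20):
    """Perceptron with labels in {-1, +1}; a stand-in for the SVM on the slide."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            if t * (dot(w, x) + b) <= 0:  # misclassified or on the boundary
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b > 0 else -1

def accuracy(w, b, X, y):
    return sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)

# Toy split: +1 = originally male-biased, -1 = originally female-biased.
X_train = [[1.0, 0.1], [-1.0, 0.1], [0.8, -0.2], [-0.9, -0.1]]
y_train = [1, -1, 1, -1]
X_test = [[0.7, 0.3], [-1.2, 0.4]]
y_test = [1, -1]

w, b = train_perceptron(X_train, y_train)
test_accuracy = accuracy(w, b, X_test, y_test)
```

High held-out accuracy on debiased vectors would indicate that gender information is still recoverable from the representations alone, which is the question this slide poses.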

  46. Classifying to gender • Results:

  47. Conclusion
      • Word embeddings exhibit gender bias
      • Debiasing is hard!
      • Social gender bias is picked up from the data by the models
      • A lot of the bias information is still recoverable (even when the bias is low/zero according to the definition usually used)
      • The way we define the bias is important, and needs to be reconsidered when trying to solve the problem

  48. Questions?

  49. Thank you!
