  1. How Computer Algorithms Expose Our Hidden Biases And How To Fix Them Victor Zimmermann LXIV. StuTS Computational Linguistics Department Heidelberg University

  2. The Shitstorm cometh.

  3. What happened?

  4. The Netflix Artwork Controversy Why are the tabloids up in arms over Netflix adverts? Welcome to a stereotypical machine learning controversy.

  5. The algorithm “If the artwork representing a title captures something compelling to you, then it acts as a gateway into that title and gives you some visual ‘evidence’ for why the title might be good for you.” [Cha+17] Figure 1: Different artworks for romance and comedy viewers.

  6. Twitter outrage

  7. Netflix’ Response “We don’t ask members for their race, gender or ethnicity so we cannot use this information to personalise their individual Netflix experience. The only information we use is a member’s viewing history.” [Iqb18]

  8. Nobody expects the Patriarchy.

  9. Sources of Bias There are some obvious reasons for bias in machine learning:
  • Your training data is bad.
  • Your algorithm is bad.
  • You are bad. And you should feel bad.

  10. Bad Training Data

  11. Human Language Spoiler: All human language is biased. Bias is not necessarily performance-based. [Tan90][GMS98] Instead it can also be encoded in the orthography, lexicography or grammar of a language.
  • Asymmetrically marked gender (generic masculine, e.g. actor vs. actress)
  • Naming conventions (e.g. Chastity vs. Bob) [Swe13]
  • Quantity of gendered insults [Sta77] (Wikipedia lists 22 misogynistic and 5 misandric slurs.)

  12. Word Embeddings What are Word Embeddings? Condensed mathematical representations of collocations. [Mik+13] Figure: a news snippet about a Barack Obama rally in Chicago and northwest Indiana, with each word mapped to a vector, e.g. Obama → (0.2, 0.6, ...), speaks → (0.1, 0.8, ...), Chicago → (0.3, 0.2, ...), press → (0.0, 0.5, ...). Now you can do maths with words!?
  $\overrightarrow{King} - \overrightarrow{Man} + \overrightarrow{Woman} = \overrightarrow{Queen}$
  $\overrightarrow{Berlin} - \overrightarrow{Germany} + \overrightarrow{France} = \overrightarrow{Paris}$
  $\overrightarrow{Programmer} - \overrightarrow{Man} + \overrightarrow{Woman} = \overrightarrow{Homemaker}$
  $\overrightarrow{Surgeon} - \overrightarrow{Man} + \overrightarrow{Woman} = \overrightarrow{Nurse}$
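  To make the arithmetic above concrete, here is a minimal sketch using gensim and a pretrained word2vec model; the file name vectors.bin is a placeholder, not something from the talk:

  from gensim.models import KeyedVectors

  # Load pretrained word2vec embeddings (e.g. the Google News vectors of [Mik+13]).
  model = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

  # King - Man + Woman ≈ ?
  print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

  # Programmer - Man + Woman ≈ ?  (this is where the embedded bias shows up)
  print(model.most_similar(positive=["programmer", "woman"], negative=["man"], topn=1))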

  13. Word Embeddings What are Word Embeddings used for?
  • Similarity Measures [Kus+15]
  • Machine Translation [Zou+13]
  • Sentence Classification [Kim14]
  • Part-of-Speech Tagging [SZ14][RRZ18]
  • Dependency Parsing [CM14]
  • Semantic Modelling [Fu+14]
  • Coreference Resolution [Lee+17]
  Basically the entire field of Computational Linguistics.

  14. Mathematical Sledgehammer What if we just remove gender? Figure 2: Mind = Blown

  15. Mathematical Sledgehammer
  • Take “good” analogies, e.g. man-woman, he-she, king-queen, etc.
  • Extract some average “gender vector” from their embeddings.
  • Subtract this new vector from all other relations.
  - Not applicable to most other kinds of bias.
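  A rough numpy sketch of that recipe, assuming emb is a dict from word to vector and pairs is the list of seed analogies; the actual method of [Bol+16] identifies the gender direction with PCA over several defining sets rather than a single averaged difference:

  import numpy as np

  def gender_direction(emb, pairs):
      # Average the difference vectors of the seed pairs, e.g. ("he", "she").
      diffs = [emb[m] - emb[f] for m, f in pairs]
      g = np.mean(diffs, axis=0)
      return g / np.linalg.norm(g)

  def remove_gender_component(vec, g):
      # Subtract the projection of vec onto the gender direction g.
      return vec - np.dot(vec, g) * g

  # g = gender_direction(emb, [("man", "woman"), ("he", "she"), ("king", "queen")])
  # debiased = {w: remove_gender_component(v, g) for w, v in emb.items()}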

  16. Mathematical Sledgehammer (in beautiful)
  Given a word set $W$, defining subsets $D_1, D_2, \ldots, D_n \subset W$, an embedding $\{\vec{w} \in \mathbb{R}^d\}_{w \in W}$, an integer parameter $k \geq 1$, words to neutralise $N \subseteq W$, and a family of equality sets $\varepsilon := \{E_1, E_2, \ldots, E_m\}$ with $E_i \subseteq W$:
  With $\mu_i := \sum_{w \in D_i} \vec{w} / |D_i|$ being the means of the defining subsets and $C := \sum_{i=1}^{n} \sum_{w \in D_i} (\vec{w} - \mu_i)^T (\vec{w} - \mu_i) / |D_i|$, the bias subspace $B$ consists of the first $k$ rows of $\mathrm{SVD}(C)$.
  Words $w \in N$ are re-embedded as $\vec{w} := (\vec{w} - \vec{w}_B) / \|\vec{w} - \vec{w}_B\|$.
  For each set $E \in \varepsilon$, let $\mu := \sum_{w \in E} \vec{w} / |E|$ and $\nu := \mu - \mu_B$; for each $w \in E$, $\vec{w} := \nu + \sqrt{1 - \|\nu\|^2}\,\frac{\vec{w}_B - \mu_B}{\|\vec{w}_B - \mu_B\|}$.
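  A compact numpy sketch of the neutralise and equalise steps above, assuming emb maps words to unit-length vectors and defining_sets, N and the equality sets are the inputs from the slide; this is a simplification of the method of [Bol+16], not their reference implementation:

  import numpy as np

  def bias_subspace(emb, defining_sets, k=1):
      # Principal directions of the centred defining sets (the matrix B above).
      rows = []
      for D in defining_sets:
          mu = np.mean([emb[w] for w in D], axis=0)
          rows.extend((emb[w] - mu) / np.sqrt(len(D)) for w in D)
      _, _, Vt = np.linalg.svd(np.array(rows), full_matrices=False)
      return Vt[:k]                              # shape (k, d)

  def project(v, B):
      # v_B: the component of v that lies inside the bias subspace.
      return B.T @ (B @ v)

  def neutralise(emb, N, B):
      for w in N:                                # w := (w - w_B) / |w - w_B|
          v = emb[w] - project(emb[w], B)
          emb[w] = v / np.linalg.norm(v)

  def equalise(emb, E, B):
      mu = np.mean([emb[w] for w in E], axis=0)
      nu = mu - project(mu, B)                   # nu := mu - mu_B
      for w in E:
          w_B = project(emb[w], B) - project(mu, B)
          emb[w] = nu + np.sqrt(1 - np.linalg.norm(nu) ** 2) * w_B / np.linalg.norm(w_B)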

  17. Bad Algorithms

  18. Google’s Image Recognition Controversy Google automatically labels pictures according to their content. Problem: Their algorithm is bad. Source: @jackyalcine on Twitter

  19. Google’s Image Recognition Controversy Their solution: Source: www.theverge.com (visited on 2018-11-06)

  20. No easy solutions. Not one of these solutions is really good.
  • Total avoidance of problem. [Iqb18]
  • Limited applicability. [Bol+16]
  • Exploitation of false classification. [BGO16]
  • Introduction of even more priors and meta parameters. [Zha+17]

  21. Bad People

  22. Facebook Actual Quote from an actual Facebook Employee “We started out of a college dorm. I mean, c’mon, we’re Facebook. We never wanted to deal with this shit.” [Sha16]

  23. Facebook Possible cause of this apathy: (Don’t quote me on this.)

  24. Help, my Chatbot joined the KKK!

  25. Microsoft Tay

  26. Microsoft Tay

  27. Microsoft Tay What can we learn from this?
  • Tay is a chat bot.
  • Tay is down with the kids?
  • Tay learns from Twitter data.

  28. Microsoft Tay The absolutely expected happens... Source: www.theguardian.com (visited on 2018-11-19)

  29. What should you take away from this talk?
  • Just because something uses “machine learning” doesn’t mean it is unbiased.
  • All language is implicitly prejudiced.
  • Training data does make a difference.
  • Diverse staff makes a difference.
  • Testing your system makes a difference.

  30. What should you take away from this talk? Don’t listen to chat bots. They may act human.

  31. Appendix

  32. Language Classification Common language identification systems use extensive news corpora for training.
  + Big corpora in most languages.
  + Mostly “unbiased” texts.
  - Written in main dialect.
  - Privileged writing staff.
  Problem: African American English is 20% less likely to be classified as English than Standard English. [BO17]
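  As a rough illustration of how such a disparity could be measured (not the evaluation pipeline of [BO17]), one could run an off-the-shelf identifier such as langid over demographically aligned tweet sets; aae_tweets and sae_tweets are assumed inputs:

  import langid

  def english_rate(tweets):
      # langid.classify returns (language, score); count how often it says "en".
      labels = [langid.classify(t)[0] for t in tweets]
      return sum(l == "en" for l in labels) / len(labels)

  # print(english_rate(aae_tweets), english_rate(sae_tweets))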

  33. Language Classification Solution by Blodgett, Green, and O’Connor (2016):
  1. Use US Census data and geolocated tweets to estimate the race of a user,
  2. Train a classifier to identify the “race” of a given tweet, based on high-AA tweets from the first set.
  Result:
  • Build a new corpus from high-AA tweets.
  • (Find out that “Asian” captures all foreign languages and use that fact for classification.)
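  As a loose sketch of that pipeline only: [BGO16] actually fit a mixed-membership demographic language model, but the basic idea of supervising a tweet classifier with census-derived neighbourhood proportions could look roughly like this (tweets and aa_proportion are hypothetical inputs):

  from sklearn.feature_extraction.text import CountVectorizer
  from sklearn.linear_model import LogisticRegression

  # Character n-gram features are robust to the non-standard spelling in tweets.
  X = CountVectorizer(analyzer="char_wb", ngram_range=(1, 4)).fit_transform(tweets)
  # Crude label: tweet was geolocated in a predominantly African-American area.
  y = [p > 0.8 for p in aa_proportion]

  clf = LogisticRegression(max_iter=1000).fit(X, y)
  # Tweets the classifier scores highly can then seed a new AAE-heavy training corpus.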

  34. References
  [Ang+16] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. “Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks”. In: ProPublica, May 23 (2016).
  [BGO16] Su Lin Blodgett, Lisa Green, and Brendan O’Connor. “Demographic dialectal variation in social media: A case study of African-American English”. In: arXiv preprint arXiv:1608.08868 (2016).

  35. References
  [BO17] Su Lin Blodgett and Brendan O’Connor. “Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English”. In: arXiv preprint arXiv:1707.00061 (2017).
  [Bol+16] Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”. In: NIPS (2016), pp. 1–9.
  [CBN17] Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. “Semantics derived automatically from language corpora contain human-like biases”. In: Science 356.6334 (2017), pp. 183–186. ISSN: 10959203. arXiv: 1608.07187.
