Why is My Classifier Discriminatory?
Irene Y. Chen, Fredrik D. Johansson, David Sontag
Massachusetts Institute of Technology (MIT)
NeurIPS 2018, Poster #120, Thurs 12/6, 10:45am – 12:45pm @ Room 210 & 230
It is surprisingly easy to make a discriminatory algorithm.
[Figure: zero-one loss of the classifier by group (Asian, Black, Hispanic, Other, White), ranging from roughly 0.16 to 0.22.]
In this paper:
1. We want to find the sources of unfairness to guide resource allocation.
2. We decompose unfairness into bias, variance, and noise.
3. We demonstrate methods to guide feature augmentation and training data collection to fix unfairness.
Classification fairness: many factors. We should examine fairness algorithms in the context of the data and the model.
Model: loss function constraints (Kamiran et al., 2010; Zafar et al., 2017); representation learning (Zemel et al., 2013); regularization (Kamishima et al., 2007; Bechavod and Ligett, 2017); tradeoffs (Chouldechova, 2017; Kleinberg et al., 2016; Corbett-Davies et al., 2017).
Data: data processing (Hajian and Domingo-Ferrer, 2013; Feldman et al., 2015); cohort selection; sample size; number of features; group distribution.
Why might my classifier be unfair?
[Figure: a learned model fit to a handful of samples, plotted against the true data function; the two curves disagree.]
Error from variance can be solved by collecting more samples.
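A minimal sketch of the variance point, assuming the toy function z = 1.6y³ from the next slide and invented sample sizes (this simulation is illustrative, not the authors' code): a flexible model's test error on a fixed underlying function shrinks as the training set grows.

```python
# Toy illustration (not from the paper): error from variance falls as we
# collect more samples of the same underlying function.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def make_data(n):
    y = rng.uniform(-2, 2, size=(n, 1))   # input feature
    z = 1.6 * y[:, 0] ** 3                # true data function (no noise)
    return y, z

y_test, z_test = make_data(2000)
for n_train in [10, 100, 1000, 10000]:
    y_tr, z_tr = make_data(n_train)
    model = DecisionTreeRegressor(random_state=0).fit(y_tr, z_tr)
    mse = mean_squared_error(z_test, model.predict(y_test))
    print(f"n_train={n_train:>6}  test MSE={mse:.4f}")  # shrinks with more data
```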
Why might my classifier be unfair?
[Figure: the learned model z = y − 2 fit to samples drawn from the true data function z = 1.6y³; the model's errors on individual points (orange and blue dots) are shown.]
Error from bias can be solved by changing the model class.
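To make the bias point concrete, here is a similar sketch (sample sizes made up): a linear model fit to data from the true function z = 1.6y³ keeps a large error no matter how much data it sees, while switching the model class to cubic features removes it.

```python
# Toy illustration (not from the paper): error from bias persists with more
# data and is fixed by changing the model class (linear -> cubic features).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def make_data(n):
    y = rng.uniform(-2, 2, size=(n, 1))
    z = 1.6 * y[:, 0] ** 3                # true data function from the slide
    return y, z

y_tr, z_tr = make_data(50_000)            # plenty of data
y_te, z_te = make_data(2_000)

linear = LinearRegression().fit(y_tr, z_tr)                               # ~ z = w*y + b
cubic = make_pipeline(PolynomialFeatures(3), LinearRegression()).fit(y_tr, z_tr)

print("linear test MSE:", mean_squared_error(z_te, linear.predict(y_te)))  # stays large
print("cubic  test MSE:", mean_squared_error(z_te, cubic.predict(y_te)))   # near zero
```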
Why might my classifier be unfair?
[Figure: a learned model whose errors on individual points (orange and blue dots) remain even with a good fit.]
Error from noise can be solved by collecting more features.
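And for noise, a sketch with made-up features and labels: the label depends on a feature the model cannot see, so no amount of extra data removes the error; collecting the missing feature does.

```python
# Toy illustration (not from the paper): error from noise (a missing feature)
# is not fixed by more samples, only by measuring the missing feature.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import zero_one_loss

rng = np.random.default_rng(0)
n = 50_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                    # feature we initially do not collect
label = (x1 + x2 > 0).astype(int)          # label depends on both features

X_partial = x1.reshape(-1, 1)              # only x1 observed
X_full = np.column_stack([x1, x2])         # after collecting x2 as well

for name, X in [("x1 only", X_partial), ("x1 + x2", X_full)]:
    model = LogisticRegression().fit(X[: n // 2], label[: n // 2])
    err = zero_one_loss(label[n // 2:], model.predict(X[n // 2:]))
    print(f"{name:8s} zero-one loss: {err:.3f}")  # large with x1 only, ~0 with both
```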
How do we define fairness?
We define fairness in the context of a loss such as false positive rate, false negative rate, etc. For example, the zero-one loss for data D and prediction Ŷ:
γ_a(Ŷ, Y, D) := Pr_D(Ŷ ≠ Y | A = a)
We can then formalize unfairness as the difference between groups:
Γ(Ŷ) := |γ_a − γ_b|
We rely on accurate Y labels and focus on algorithmic error.
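As a sketch of how this definition could be computed in practice (the helper names and the toy arrays below are mine, not the paper's code), the per-group zero-one loss γ_a and the discrimination level Γ follow directly from predictions, true labels, and group membership:

```python
# Minimal sketch: per-group zero-one loss and the discrimination level Γ.
import numpy as np

def group_losses(y_true, y_pred, groups):
    """Zero-one loss γ_g for each protected group g."""
    return {g: float(np.mean(y_pred[groups == g] != y_true[groups == g]))
            for g in np.unique(groups)}

def discrimination_level(y_true, y_pred, groups, a, b):
    """Γ := |γ_a − γ_b| for two groups a and b."""
    gamma = group_losses(y_true, y_pred, groups)
    return abs(gamma[a] - gamma[b])

# Made-up example:
y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 1, 0, 0])
groups = np.array(["a", "a", "a", "b", "b", "b"])
print(group_losses(y_true, y_pred, groups))                    # γ_a ≈ 0.33, γ_b ≈ 0.67
print(discrimination_level(y_true, y_pred, groups, "a", "b"))  # Γ ≈ 0.33
```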
Why might my classifier be unfair?
Theorem 1: For the error over group a given a predictor Ŷ:
γ̄_a(Ŷ) = B̄_a(Ŷ) + V̄_a(Ŷ) + N̄_a
Note that N̄_a indicates the expectation of the noise N_a over X and the data D.
Accordingly, the expected discrimination level Γ̄ := |γ̄_a − γ̄_b| can be decomposed into differences in bias, differences in variance, and differences in noise:
Γ̄ = |(B̄_a − B̄_b) + (V̄_a − V̄_b) + (N̄_a − N̄_b)|
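The decomposition concerns expectations over training sets. One rough way to estimate the bias and variance terms empirically (this bootstrap procedure is my own sketch in the spirit of standard bias–variance estimates for zero-one loss, not the authors' released code) is to retrain the model on resampled training sets and compare each prediction to the majority-vote "main" prediction; the noise term is left folded into the bias estimate, since separating it needs repeated labels or an estimate of the Bayes error.

```python
# Rough sketch: per-group bias(+noise) and variance of zero-one loss,
# estimated by retraining on bootstrap resamples of the training data.
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def group_bias_variance(model, X_tr, y_tr, X_te, y_te, g_te, n_boot=30, seed=0):
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X_tr), size=len(X_tr))   # bootstrap resample
        preds.append(clone(model).fit(X_tr[idx], y_tr[idx]).predict(X_te))
    preds = np.array(preds)                                # (n_boot, n_test)
    main = (preds.mean(axis=0) > 0.5).astype(int)          # majority-vote prediction (binary labels)
    out = {}
    for g in np.unique(g_te):
        m = g_te == g
        out[g] = {
            # error of the main prediction: bias plus any irreducible noise
            "bias_plus_noise": float(np.mean(main[m] != y_te[m])),
            # average disagreement of resampled models with the main prediction
            "variance": float(np.mean(preds[:, m] != main[m])),
        }
    return out

# Example with synthetic data (groups, features, labels all made up):
rng = np.random.default_rng(1)
n = 4000
g = rng.integers(0, 2, size=n)             # protected attribute: group 0 or 1
X = rng.normal(size=(n, 5)) + g[:, None] * 0.5
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
half = n // 2
print(group_bias_variance(LogisticRegression(), X[:half], y[:half],
                          X[half:], y[half:], g[half:]))
```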
Mortality prediction from MIMIC-III clinical notes
1. We found statistically significant racial differences in zero-one loss.
[Figure: zero-one loss by race (Asian, Black, Hispanic, Other, White), roughly 0.16 to 0.22.]
2. By subsampling data, we fit inverse power laws to estimate the benefit of more data in reducing variance.
[Figure: zero-one loss (about 0.19–0.27) versus training data size (0–15,000) for each racial group.]
3. Using topic modeling, we identified subpopulations in which to gather more features to reduce noise.
[Figure: error enrichment by racial group for cancer patients and cardiac patients, with per-group patient counts.]
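A sketch of the subsampling step in point 2 (the parametric form loss(n) ≈ α·n^(−β) + γ and the numbers below are my assumptions, chosen to mimic the learning curves on this slide, not the released code): fit an inverse power law to the per-group losses measured at several training-set sizes, then extrapolate to estimate how much more data would help.

```python
# Sketch: fit an inverse power law to per-group losses measured on
# subsampled training sets, then extrapolate to larger data sizes.
import numpy as np
from scipy.optimize import curve_fit

def inverse_power_law(n, alpha, beta, gamma):
    """loss(n) ≈ alpha * n**(-beta) + gamma  (gamma ~ irreducible error)."""
    return alpha * np.power(n, -beta) + gamma

# Per-group losses measured at several training-set sizes (made-up numbers):
sizes = np.array([1000, 2000, 4000, 8000, 16000], dtype=float)
losses = np.array([0.27, 0.25, 0.23, 0.22, 0.21])

params, _ = curve_fit(inverse_power_law, sizes, losses,
                      p0=[1.0, 0.5, 0.1], bounds=(0, [10.0, 2.0, 1.0]))
alpha, beta, gamma = params
print("fitted (alpha, beta, gamma):", params)
print("predicted loss at n=100k:", inverse_power_law(1e5, *params))
print("estimated irreducible loss (asymptote):", gamma)
```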
Where do we go from here?
1. For accurate and fair models deployed in real-world applications, both the data and the model should be considered.
2. Using easily implemented fairness checks, we hope others will check their algorithms for bias, variance, and noise, which will guide further efforts to reduce unfairness.
Come to poster #120 in Room 210 & 230.