FR-Train: A Mutual Information-Based Approach to Fair and Robust Training
Yuji Roh, Kangwook Lee, Steven E. Whang, Changho Suh
Presented by Yuji Roh, Data Intelligence Lab, KAIST
Trustworthy AI
"AI has significant potential to help solve challenging problems, including by advancing medicine, understanding language, and fueling scientific discovery. To realize that potential, it's critical that AI is used and developed responsibly." - Google AI, 2020
"Moving forward, 'build for performance' will not suffice as an AI design paradigm. We must learn how to build, evaluate and monitor for trust." - IBM Research, Trusting AI, 2020
Trustworthy AI
Pillars of trustworthy AI: Fairness, Robustness, Value Alignment, Transparency, Explainability & Accountability
Among these, Fairness and Robustness are the data-related pillars, and they are the focus of this work
Two approaches
⚬ Two-step approach: sanitize data -> fair training
  - Pipeline: Poisoned Dataset -> Sanitization -> Fair Training
  - Downside: it is very difficult to "decouple" poisoning and bias
⚬ Holistic approach: fair & robust training (this is FR-Train)
  - Pipeline: Poisoned Dataset -> Fair and Robust Training
  - Performing the two operations together with model training results in much better performance
Agenda
01 Motivation
02 FR-Train
03 Experiments
04 Takeaways
01 Motivation
Fairness
⚬ A machine learning model learns the bias and discrimination present in its training data
⚬ Notation: feature x, label y, sensitive group attribute z, and predicted label ŷ
⚬ The fairness of a (binary) classifier can be defined in various ways:
  - Demographic Parity (⇔ Disparate Impact, the ratio form used in this talk): P(ŷ = 1 | z = 0) = P(ŷ = 1 | z = 1)
  - Equalized Odds
⚬ The level of fairness can be measured as a ratio or as a difference
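As a concrete reference, here is a minimal sketch of how the disparate impact ratio could be measured from model outputs; the helper name and toy arrays are illustrative, not taken from the talk.

```python
import numpy as np

def disparate_impact(y_pred, z):
    """Ratio of positive-prediction rates between two groups.
    DI lies in [0, 1]; DI = 1 means equal rates (perfectly fair)."""
    r0 = y_pred[z == 0].mean()   # positive rate in group z = 0
    r1 = y_pred[z == 1].mean()   # positive rate in group z = 1
    return min(r0 / r1, r1 / r0)

y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])  # model predictions
z      = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # sensitive attribute
print(disparate_impact(y_pred, z))           # 0.25/0.75 -> DI = 0.33
```

The difference form of demographic parity is the analogous quantity |r0 - r1|, with 0 meaning perfectly fair.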
Robustness
⚬ Datasets are easy to publish nowadays, but as a result they are easy to "poison" as well
  - Poisoning = noisy, subjective, or even adversarial data
  - Attacker's goal: increase the test loss by poisoning the data
  - Defender's goal: train a classifier with a small test loss
⚬ Data poisoning is already a serious issue in federated learning
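Label flipping, the attack used in the experiments later in the talk, is one simple instance of such poisoning. A minimal sketch (the function name and seed handling are illustrative):

```python
import numpy as np

def flip_labels(y, rate=0.1, seed=0):
    """Label-flipping attack sketch: flip the binary labels of a
    random `rate` fraction of the training examples."""
    rng = np.random.default_rng(seed)
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    return y_poisoned

y = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0])
print(flip_labels(y, rate=0.1))   # one of the ten labels is flipped
```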
Fairness + Robustness
What happens if we simply apply a fairness-aware algorithm to a poisoned dataset?
- It may result in a strictly worse (accuracy, fairness) trade-off than vanilla training
Motivating example
[Figure: ten data points from sensitive groups A and B with positive and negative labels, separated by a decision boundary; the poisoned version has some labels flipped]
⚬ On the clean data:
  - Vanilla classifier: (Acc, DI) = (1, 0.5)
  - Fair classifier: (Acc, DI) = (0.8, 1)
⚬ On the poisoned data:
  - Vanilla classifier: Acc_poi = 0.9; (Acc_clean, DI) = (0.9, 0.67), i.e., accuracy drops but DI improves
  - Fair classifier: Acc_poi = 0.8; (Acc_clean, DI) = (0.6, 1), i.e., DI is unchanged but clean accuracy drops sharply
⚬ The fair classifier trained on poisoned data is strictly suboptimal: its (0.6, 1) is dominated by the (0.8, 1) that fair training achieves on clean data
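The scatter figure did not survive extraction, so as a sanity check, here is hypothetical toy data consistent with the clean-data numbers above (not the slide's actual points), reusing the disparate_impact helper from the earlier sketch:

```python
import numpy as np

z      = np.array([0]*5 + [1]*5)             # group A = 0, group B = 1
y_true = np.array([1, 1, 1, 1, 0,            # 4/5 positive in group A
                   1, 1, 0, 0, 0])           # 2/5 positive in group B

vanilla = y_true.copy()                      # fits the labels perfectly
fair    = np.array([1, 1, 1, 0, 0,           # equal positive rates:
                    1, 1, 1, 0, 0])          # 3/5 in each group

for name, pred in [("vanilla", vanilla), ("fair", fair)]:
    acc = (pred == y_true).mean()
    print(name, acc, disparate_impact(pred, z))
# vanilla -> (Acc, DI) = (1.0, 0.5);  fair -> (Acc, DI) = (0.8, 1.0)
```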
Fairness + Robustness
Applying a fairness-aware algorithm to a poisoned dataset can thus be strictly suboptimal in both accuracy and fairness. We need a holistic approach to fair and robust training: FR-Train!
02 FR-Train
FR-Train - Main contributions
⚬ FR-Train is a holistic framework for fair and robust training
⚬ It extends a state-of-the-art fairness-only method, Adversarial Debiasing:
  - Provides a novel mutual information (MI)-based interpretation of adversarial learning
  - Adds a robustness discriminator that uses a small clean validation set for data sanitization
⚬ We also propose crowdsourcing methods for constructing the clean validation set
FR-Train architecture
⚬ "Classifier": makes the prediction, e.g., permit or deny a loan
⚬ "Discriminator for fairness": tries to distinguish the sensitive group (e.g., male or female) from the classifier's predictions, so the classifier learns to make predictions that reveal nothing about the group
⚬ "Discriminator for robustness": tries to distinguish examples from the poisoned training set (paired with predicted labels, which are affected by the poisoning) from examples in a clean validation set (paired with their clean true labels)
⚬ The clean validation set is constructed with crowdsourcing
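The extracted slides only show the block diagram, so here is a minimal PyTorch-style sketch of the resulting three-player game. The network shapes, loss weights lambda_f/lambda_r, and the exact pairing of inputs fed to the robustness discriminator are assumptions for illustration, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

d = 5                                        # feature dimension (assumed)
clf    = nn.Sequential(nn.Linear(d, 1))      # classifier: x -> logit of y_hat
disc_f = nn.Sequential(nn.Linear(1, 1))      # fairness disc.: y_hat -> logit of z
disc_r = nn.Sequential(nn.Linear(d + 1, 1))  # robustness disc.: (x, label) -> poisoned?

bce = nn.BCEWithLogitsLoss()
opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-3)
opt_dsc = torch.optim.Adam(list(disc_f.parameters()) + list(disc_r.parameters()), lr=1e-3)
lambda_f, lambda_r = 0.5, 0.5                # fairness / robustness trade-off knobs

def train_step(x_tr, y_tr, z_tr, x_val, y_val):
    """One adversarial step; y_tr, z_tr, y_val are float tensors of shape (n, 1)."""
    y_hat = torch.sigmoid(clf(x_tr))         # soft predictions on poisoned data

    # 1) train both discriminators (classifier frozen via detach)
    opt_dsc.zero_grad()
    loss_f = bce(disc_f(y_hat.detach()), z_tr)         # guess group from y_hat
    pois   = torch.cat([x_tr, y_hat.detach()], dim=1)  # (x, predicted label)
    clean  = torch.cat([x_val, y_val], dim=1)          # (x, clean true label)
    loss_r = bce(disc_r(pois), torch.ones(len(pois), 1)) + \
             bce(disc_r(clean), torch.zeros(len(clean), 1))
    (loss_f + loss_r).backward()
    opt_dsc.step()

    # 2) train the classifier to predict y while fooling both discriminators
    opt_clf.zero_grad()
    y_hat = torch.sigmoid(clf(x_tr))
    loss = bce(clf(x_tr), y_tr) \
         - lambda_f * bce(disc_f(y_hat), z_tr) \
         - lambda_r * bce(disc_r(torch.cat([x_tr, y_hat], dim=1)),
                          torch.ones(len(x_tr), 1))
    loss.backward()
    opt_clf.step()
```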
Mutual information-based interpretation
[Figure: the classifier's softmax output feeds the fairness discriminator and the robustness discriminator]
⚬ Theorem 1 (Fairness): at the optimum, the fairness discriminator's objective corresponds to the mutual information between the classifier's prediction and the sensitive attribute
⚬ Theorem 2 (Robustness): at the optimum, the robustness discriminator's objective corresponds to the mutual information between its input distribution and the poisoned/clean indicator
⚬ The robustness discriminator's output is also used to reweight training examples, down-weighting likely-poisoned ones
⚬ See the paper for proofs
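Putting the two theorems together, the adversarial game can be read as minimizing an MI-regularized loss. The following is a sketch of that combined objective; the notation (t for the poisoned/clean indicator, lambda_1 and lambda_2 for the trade-off weights) is assumed here rather than copied from the paper:

```latex
% Sketch: cross-entropy loss plus the two MI terms that the
% discriminators estimate at the adversarial optimum.
\min_{\theta}\;
  \mathcal{L}_{\mathrm{ce}}(\theta)                 % accuracy
  \;+\; \lambda_1\, I(\hat{y};\, z)                 % fairness (Thm. 1)
  \;+\; \lambda_2\, I\!\big((x,\hat{y});\, t\big)   % robustness (Thm. 2)
```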
03 Experiments
Experimental setting
⚬ Synthetic data
  - Poisoning (label flipping): 10% of the training data
  - Clean validation set: 10% of the training data
⚬ Real data (results in the paper)
  - COMPAS: predict whether a defendant recidivates within two years
  - AdultCensus: predict whether annual income exceeds $50K
  - Poisoning: 10% of the training data
  - Clean validation set: 5% of the training data
Synthetic data results
[Figure: accuracy vs. disparate impact of each method on the poisoned synthetic data]
⚬ Fair-only algorithms: low accuracy
⚬ Logistic regression: low fairness
⚬ Two-step approach (data sanitization using the clean validation set, then fair training): also low accuracy
⚬ FR-Train (holistic approach): high fairness & high accuracy
04 Takeaways