
FR-Train: A Mutual Information-based Fair and Robust Training

  1. FR-Train: A Mutual Information-based Fair and Robust Training. Yuji Roh, Kangwook Lee, Steven E. Whang, Changho Suh. Presented by Yuji Roh, Data Intelligence Lab, KAIST

  2. Trustworthy AI. "AI has significant potential to help solve challenging problems, including by advancing medicine, understanding language, and fueling scientific discovery. To realize that potential, it's critical that AI is used and developed responsibly." - AI, 2020. "Moving forward, 'build for performance' will not suffice as an AI design paradigm. We must learn how to build, evaluate and monitor for trust." - Trusting AI, 2020

  3. Trustworthy AI: Fairness, Robustness, Value Alignment, Explainability, Transparency & Accountability

  4. Trustworthy AI: among these pillars, Fairness and Robustness are the data-related ones.

  5. Two approaches. ⚬ Two-step approach: sanitize the data, then run fair training (Poisoned Dataset → Sanitization → Fair Training). Downside: it is very difficult to "decouple" poisoning and bias.

  6. Two approaches. ⚬ Two-step approach: sanitize the data, then run fair training (Poisoned Dataset → Sanitization → Fair Training); downside: very difficult to "decouple" poisoning and bias. ⚬ Holistic approach: fair and robust training (Poisoned Dataset → Fair and Robust Training); performing the two operations along with model training results in much better performance.

  7. Two approaches. ⚬ Two-step approach: sanitize the data, then run fair training (Poisoned Dataset → Sanitization → Fair Training); downside: very difficult to "decouple" poisoning and bias. ⚬ Holistic approach (FR-Train): fair and robust training (Poisoned Dataset → Fair and Robust Training); performing the two operations along with model training results in much better performance.

  8. 01 Motivation 02 FR-Train 03 Experiments 04 Takeaways

  9. 01 Motivation 02 FR-Train 03 Experiments 04 Takeaways

  10. Trustworthy AI: among these pillars, Fairness and Robustness are the data-related ones.

  11. Fairness. Setting: features, a label, a group (sensitive) attribute, and the classifier's predicted label. ⚬ A machine learning model learns the bias and discrimination present in the data. ⚬ The fairness of a (binary) classifier can be defined in various ways: Demographic Parity (⇔ Disparate Impact) and Equalized Odds. ⚬ The level of fairness can be measured as a ratio or a difference.

  12. Fairness. Setting: features, a label, a group (sensitive) attribute, and the classifier's predicted label. ⚬ A machine learning model learns the bias and discrimination present in the data. ⚬ The fairness of a (binary) classifier can be defined in various ways: Demographic Parity (⇔ Disparate Impact), which is the notion used in this talk, and Equalized Odds. ⚬ The level of fairness can be measured as a ratio or a difference.
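
As a quick illustration (a minimal Python sketch, not from the slides; it assumes binary numpy arrays y_hat for predictions and z for group membership), demographic parity can be measured either as a ratio (disparate impact) or as a difference:

    import numpy as np

    def demographic_parity(y_hat, z):
        """Measure demographic parity of binary predictions y_hat w.r.t. binary group z."""
        rate_z0 = y_hat[z == 0].mean()  # P(y_hat = 1 | z = 0)
        rate_z1 = y_hat[z == 1].mean()  # P(y_hat = 1 | z = 1)
        ratio = min(rate_z0, rate_z1) / max(rate_z0, rate_z1)  # disparate impact: 1 = perfectly fair
        diff = abs(rate_z0 - rate_z1)                          # parity difference: 0 = perfectly fair
        return ratio, diff

    y_hat = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
    z     = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
    print(demographic_parity(y_hat, z))  # ≈ (0.67, 0.2)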

  13. Robustness. ⚬ Datasets are easy to publish nowadays, but as a result easy to "poison" as well. - Poison = noisy, subjective, or even adversarial data. - Attacker's goal: increase the test loss by poisoning the data. - Defender's goal: train a classifier with small test loss. ⚬ Data poisoning is already a serious issue in federated learning.

  14. Fairness + Robustness. What happens if we just apply a fairness-aware algorithm on a poisoned dataset? - It may result in an (accuracy, fairness) pair that is strictly suboptimal compared to vanilla training.

  15. Motivating example (clean data). A and B denote the sensitive groups; each point has a positive or negative label. A vanilla classifier achieves (Acc, DI) = (1, 0.5).

  16. Motivating example (clean data). The vanilla classifier achieves perfect accuracy but low fairness: (Acc, DI) = (1, 0.5).

  17. Motivating example (clean data). A fair classifier achieves (Acc, DI) = (0.8, 1): perfect fairness at some accuracy cost, versus (1, 0.5) for the vanilla classifier.

  18. Motivating example. The dataset is now poisoned. On the clean data, the vanilla classifier achieved (Acc, DI) = (1, 0.5) and the fair classifier (0.8, 1).

  19. Motivating example (poisoned data). The vanilla classifier's accuracy drops and its DI rises: Acc_poi = 0.9, and measured on the clean labels, (Acc_clean, DI) = (0.9, 0.67).

  20. Motivating example (poisoned data). The fair classifier gets Acc_poi = 0.8 with DI unchanged at 1, but on the clean labels (Acc_clean, DI) = (0.6, 1): strictly suboptimal.

  21. Fairness + Robustness. What happens if we just apply a fairness-aware algorithm on a poisoned dataset? - It may result in an (accuracy, fairness) pair that is strictly suboptimal compared to vanilla training. We need a holistic approach to fair and robust training: FR-Train!

  22. 01 Motivation 02 FR-Train 03 Experiments 04 Takeaways

  23. FR-Train - Main contributions. ⚬ FR-Train is a holistic framework for fair and robust training. ⚬ It extends a state-of-the-art fairness-only method called Adversarial Debiasing: - it provides a novel mutual information (MI)-based interpretation of adversarial learning, and - it adds a robust discriminator that uses a small clean validation set for data sanitization. ⚬ We also propose crowdsourcing methods for constructing a clean validation set.

  24. FR-Train. The "Classifier" makes loan decisions (permit or deny); the "Discriminator for Fairness" tries to distinguish the gender (male or female) from the classifier's predictions.

  25. FR-Train. In addition to the classifier (loan decisions: permit or deny) and the fairness discriminator (distinguishing the gender from the predictions), a "Discriminator for Robustness" takes the poisoned training set together with the predicted labels (which are affected by the poisoning) and tries to distinguish whether examples are poisoned or clean.

  26. FR-Train. The robustness discriminator distinguishes the poisoned training set (with the poisoning-affected predicted labels) from a clean validation set (with the clean true labels).

  27. FR-Train. The clean validation set is constructed with crowdsourcing.
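
To make the architecture concrete, here is a minimal PyTorch-style sketch of one adversarial training step. This is an illustration, not the authors' implementation: the network sizes, the trade-off weights lam_f and lam_r, and the exact form of the adversarial loss terms are assumptions, and binary labels with a binary group attribute are assumed.

    import torch
    import torch.nn as nn

    d = 5  # illustrative feature dimension
    classifier = nn.Sequential(nn.Linear(d, 1), nn.Sigmoid())      # x -> P(y = 1 | x)
    fair_disc  = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())      # y_hat -> P(z = 1)
    rob_disc   = nn.Sequential(nn.Linear(d + 1, 1), nn.Sigmoid())  # (x, label) -> P(clean)

    opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-3)
    opt_f = torch.optim.Adam(fair_disc.parameters(), lr=1e-3)
    opt_r = torch.optim.Adam(rob_disc.parameters(), lr=1e-3)
    bce = nn.BCELoss()
    lam_f, lam_r = 0.5, 0.5  # illustrative trade-off weights

    def train_step(x_tr, y_tr, z_tr, x_val, y_val):
        # x_*: (N, d) float tensors; y_*, z_tr: (N, 1) float tensors in {0, 1}.
        y_hat = classifier(x_tr)

        # 1) Fairness discriminator: predict the group attribute z from the prediction.
        loss_f = bce(fair_disc(y_hat.detach()), z_tr)
        opt_f.zero_grad(); loss_f.backward(); opt_f.step()

        # 2) Robustness discriminator: separate training pairs (x, y_hat), which may be
        #    affected by poisoning, from clean validation pairs (x, y).
        tr_pair  = torch.cat([x_tr, y_hat.detach()], dim=1)
        val_pair = torch.cat([x_val, y_val], dim=1)
        loss_r = bce(rob_disc(tr_pair),  torch.zeros(len(x_tr), 1)) + \
                 bce(rob_disc(val_pair), torch.ones(len(x_val), 1))
        opt_r.zero_grad(); loss_r.backward(); opt_r.step()

        # 3) Classifier: fit the (possibly poisoned) labels while fooling both
        #    discriminators -- hide z from the fairness discriminator and make the
        #    training pairs look clean to the robustness discriminator.
        y_hat = classifier(x_tr)
        loss_c = bce(y_hat, y_tr) \
                 - lam_f * bce(fair_disc(y_hat), z_tr) \
                 + lam_r * bce(rob_disc(torch.cat([x_tr, y_hat], dim=1)),
                               torch.ones(len(x_tr), 1))
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()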

  28. Mutual information-based interpretation. The classifier's softmax output is fed to the fairness discriminator and the robustness discriminator; Theorem 1 characterizes the fairness term and Theorem 2 the robustness term.

  29. Mutual information-based interpretation: Theorem 1 - Fairness (an MI-based characterization of the fairness discriminator term).
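
For context (a background identity, not the statement of Theorem 1 from the slide): demographic parity asks the prediction to be independent of the group attribute, which is exactly the condition that their mutual information vanishes, so an adversary that cannot recover z from the classifier's output certifies fairness:

    \[
      \hat{Y} \perp Z
      \;\Longleftrightarrow\;
      I(\hat{Y}; Z)
      = \sum_{\hat{y}, z} P(\hat{y}, z) \log \frac{P(\hat{y}, z)}{P(\hat{y})\, P(z)}
      = 0 .
    \]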

  30. Mutual information-based interpretation: Theorem 2 - Robustness (an MI-based characterization of the robustness discriminator term).
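
Similarly, as background (not the statement of Theorem 2): if T indicates whether a (feature, label) pair comes from the poisoned training side or the clean validation side, with both sides equally likely, then the mutual information between the pair and T equals the Jensen-Shannon divergence between the two distributions, so a robustness discriminator that cannot tell the sides apart certifies that the classifier's outputs match the clean data:

    \[
      I\big((X, \text{label});\, T\big)
      \;=\; \mathrm{JSD}\!\left(P_{\text{train}} \,\|\, P_{\text{val}}\right),
      \qquad T \sim \mathrm{Bernoulli}(\tfrac{1}{2}).
    \]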

  31. Mutual information-based interpretation. See the paper for the proofs of Theorems 1 and 2.

  32. Mutual information-based interpretation. The robustness discriminator's output is also used to reweight the training examples.
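
One plausible way to realize such reweighting (a hedged sketch, not the authors' exact rule; clean_prob is assumed to be the robustness discriminator's estimated probability that an example is clean):

    import torch
    import torch.nn.functional as F

    def weighted_classifier_loss(y_hat, y, clean_prob):
        # y_hat, y, clean_prob: (N, 1) float tensors.
        w = clean_prob / clean_prob.mean()                    # normalize so the mean weight is 1
        per_example = F.binary_cross_entropy(y_hat, y, reduction="none")
        return (w.detach() * per_example).mean()              # down-weights likely-poisoned examples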

  33. 01 Motivation 02 FR-Train 03 Experiments 04 Takeaways

  34. Experimental setting. ⚬ Synthetic data - Poisoning (label flipping): 10% of the training data - Validation set: 10% of the training data. ⚬ Real data (results in the paper) - COMPAS: predict recidivism within two years - AdultCensus: predict whether annual income exceeds $50K - Poisoning: 10% of the training data - Validation set: 5% of the training data.
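
As a rough illustration of this kind of label-flipping poisoning (a sketch that flips a random fraction of binary labels; the paper's attack may select which examples to flip differently):

    import numpy as np

    def flip_labels(y, frac=0.1, seed=0):
        # Return a poisoned copy of binary labels y with a random fraction flipped.
        rng = np.random.default_rng(seed)
        y_poisoned = y.copy()
        idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
        y_poisoned[idx] = 1 - y_poisoned[idx]  # flip 0 <-> 1
        return y_poisoned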

  35. Synthetic data results. Baselines shown: logistic regression, fair-only algorithms, and a two-step approach (data sanitization using the clean validation set, then fair training).

  36. Synthetic data results. The fair-only algorithms achieve low accuracy.

  37. Synthetic data results. Logistic regression achieves low fairness.

  38. Synthetic data results. The two-step approach (data sanitization using the clean validation set, then fair training) also has low accuracy.

  39. Synthetic data results. The holistic approach achieves both high fairness and high accuracy.

  40. 01 Motivation 02 FR-Train 03 Experiments 04 Takeaways
