

  1. Fairness in ML 2: Equal opportunity and odds Privacy & Fairness in Data Science CS848 Fall 2019 Slides adapted from https://fairmlclass.github.io/4.html

  2. Outline
     • Recap: Disparate Impact
       – Issues with Disparate Impact
     • Observational measures of fairness
       – Equal opportunity and Equalized odds
       – Predictive Value Parity
       – Tradeoff
     • Achieving Equalized Odds
       – Binary Classifier

  3. Recap: Disparate Impact
     • Let D = (X, Y, C) be a labeled data set, where X = 0 means protected, C = 1 is the positive class (e.g., admitted), and Y is everything else.
     • We say that a classifier f has disparate impact (DI) of υ (0 < υ < 1) if
         Pr[f(Y) = 1 | X = 0] / Pr[f(Y) = 1 | X = 1] ≤ υ,
       that is, if the protected class is positively classified less than υ times as often as the unprotected class (legally, υ = 0.8 is common).
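The DI ratio can be estimated directly from samples. A minimal sketch in Python (the data, function names, and 0/1 encodings are illustrative, not from the slides):

```python
def group_rate(pred, x, group):
    """Estimate Pr[f(Y) = 1 | X = group] from samples."""
    vals = [p for p, g in zip(pred, x) if g == group]
    return sum(vals) / len(vals)

def disparate_impact(pred, x):
    """DI ratio: positive rate of the protected group (X = 0)
    over that of the unprotected group (X = 1)."""
    return group_rate(pred, x, 0) / group_rate(pred, x, 1)

# Hypothetical example: protected group accepted 2/4, unprotected 3/4.
pred = [1, 1, 0, 0, 1, 1, 1, 0]
x    = [0, 0, 0, 0, 1, 1, 1, 1]
print(disparate_impact(pred, x))  # 0.5 / 0.75 ≈ 0.667, below the 0.8 legal threshold
```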

  4. Recap: Disparate Impact
       X (protected attribute, e.g., Race) | Y (features) | f(Y) (prediction, e.g., Bail)
                    0                      |      …       |        1 (Y)     ← protected group
                    1                      |      …       |        0 (N)
                    1                      |      …       |        0 (N)
                    …                      |      …       |          …
     Notation: Q_{X=0}[F] = Pr[F | X = 0], Q_{X=1}[F] = Pr[F | X = 1]

  5. Recap: Disparate Impact
     (same table as slide 4)
     Classifier f has DI of υ:  Q_{X=0}[f(Y) = 1] / Q_{X=1}[f(Y) = 1] ≤ υ

  6. Demographic parity (the reverse of disparate impact)
     • Definition. Classifier f satisfies demographic parity if f(Y) is independent of X.
     • When f is a binary 0/1 variable, this means: for all groups x and x′,
         Q_{X=x}[f(Y) = 1] = Q_{X=x′}[f(Y) = 1]
     • Approximate versions:
       – Q_{X=x}[f(Y) = 1] / Q_{X=x′}[f(Y) = 1] ≥ 1 − ϑ
       – |Q_{X=x}[f(Y) = 1] − Q_{X=x′}[f(Y) = 1]| ≤ ϑ
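Both approximate versions can be checked the same way from samples; a sketch (function names, data, and the threshold value are illustrative):

```python
def positive_rate(pred, x, group):
    """Estimate Q_{X=group}[f(Y) = 1] from samples."""
    vals = [p for p, g in zip(pred, x) if g == group]
    return sum(vals) / len(vals)

def demographic_parity_check(pred, x, eps=0.05):
    """Return (ratio_ok, diff_ok) for the two approximate versions,
    taken over groups 0 and 1."""
    r0, r1 = positive_rate(pred, x, 0), positive_rate(pred, x, 1)
    ratio_ok = min(r0, r1) / max(r0, r1) >= 1 - eps  # ratio version, worst direction
    diff_ok = abs(r0 - r1) <= eps                    # difference version
    return ratio_ok, diff_ok

# Hypothetical data: positive rates 0.5 vs 0.75 fail both versions at eps = 0.05.
pred = [1, 0, 1, 0, 1, 0, 1, 1]
x    = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_check(pred, x))  # (False, False)
```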

  7. Demographic parity: Issues
     (figure: data points with C = 1 for groups X = 1 and X = 0)

  8. Demographic parity: Issues
     (figure: X = 1 classified well while X = 0 is classified at random, yet both groups have the same positive rate)
     • Does not seem “fair” to allow random performance on X = 0
     • Perfect classification is impossible

  9. Outline
     • Recap: Disparate Impact
       – Issues with Disparate Impact
     • Observational measures of fairness
       – Equal opportunity and Equalized odds
       – Predictive Value Parity
       – Tradeoff
     • Achieving Equalized Odds
       – Binary Classifier

  10. True Positive Parity (TPP), or equal opportunity
      • Assume classifier f and label C are binary 0/1 variables.
      • Definition. Classifier f satisfies true positive parity if for all groups x and x′,
          Q_{X=x}[f(Y) = 1 | C = 1] = Q_{X=x′}[f(Y) = 1 | C = 1]
      • Appropriate when the positive outcome (1) is desirable
      • Equivalently, when the primary harm is due to false negatives
        – e.g., denying bail to a person who will not recidivate

  11. TPP
      (figure: both groups classify all C = 1 points correctly)
      • Forces similar performance on C = 1

  12. False Positive Parity (FPP)
      • Assume classifier f and label C are binary 0/1 variables.
      • Definition. Classifier f satisfies false positive parity if for all groups x and x′,
          Q_{X=x}[f(Y) = 1 | C = 0] = Q_{X=x′}[f(Y) = 1 | C = 0]
      • TPP & FPP together: Equalized Odds, or Positive Rate Parity
        – f satisfies equalized odds if f(Y) is conditionally independent of X given C.
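TPP and FPP can be checked together by comparing per-group rates conditioned on C; a sketch (function names and data are illustrative):

```python
def cond_rate(pred, label, x, group, c):
    """Estimate Q_{X=group}[f(Y) = 1 | C = c] from samples."""
    vals = [p for p, y, g in zip(pred, label, x) if g == group and y == c]
    return sum(vals) / len(vals)

def equalized_odds_gaps(pred, label, x):
    """Return (TPP gap, FPP gap) between groups 0 and 1;
    both gaps equal to 0 means equalized odds holds."""
    tpp_gap = abs(cond_rate(pred, label, x, 0, 1) - cond_rate(pred, label, x, 1, 1))
    fpp_gap = abs(cond_rate(pred, label, x, 0, 0) - cond_rate(pred, label, x, 1, 0))
    return tpp_gap, fpp_gap

# Hypothetical data where both groups have TPR = 1 and FPR = 1/2,
# mirroring the worked example on the next two slides.
pred  = [1, 1, 1, 0, 1, 1, 1, 0]
label = [1, 1, 0, 0, 1, 1, 0, 0]
x     = [1, 1, 1, 1, 0, 0, 0, 0]
print(equalized_odds_gaps(pred, label, x))  # (0.0, 0.0)
```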

  13. Positive Rate Parity
      (figure: example classification for X = 1 and X = 0)
      Q_{X=1}[f(Y) = 1 | C = 1] = ?    Q_{X=1}[f(Y) = 1 | C = 0] = ?
      Q_{X=0}[f(Y) = 1 | C = 1] = ?    Q_{X=0}[f(Y) = 1 | C = 0] = ?

  14. Positive Rate Parity
      (figure: same example)
      Q_{X=1}[f(Y) = 1 | C = 1] = 1    Q_{X=1}[f(Y) = 1 | C = 0] = 1/2
      Q_{X=0}[f(Y) = 1 | C = 1] = 1    Q_{X=0}[f(Y) = 1 | C = 0] = 1/2

  15. Outline
      • Recap: Disparate Impact
        – Issues with Disparate Impact
      • Observational measures of fairness
        – Equal opportunity and Equalized odds
        – Predictive Value Parity
        – Tradeoff
      • Achieving Equalized Odds
        – Binary Classifier

  16. Predictive Value Parity
      • Assume classifier f and label C are binary 0/1 variables.
      • Definition. Classifier f satisfies
        – positive predictive value parity if for all groups x and x′,
            Q_{X=x}[C = 1 | f(Y) = 1] = Q_{X=x′}[C = 1 | f(Y) = 1]
        – negative predictive value parity if for all groups x and x′,
            Q_{X=x}[C = 1 | f(Y) = 0] = Q_{X=x′}[C = 1 | f(Y) = 0]
        – predictive value parity if it satisfies both of the above.
      • Equalizes the chance of success given acceptance.

  17. Predictive Value Parity
      (figure: same example)
      Q_{X=1}[C = 1 | f(Y) = 1] = ?    Q_{X=1}[C = 1 | f(Y) = 0] = ?
      Q_{X=0}[C = 1 | f(Y) = 1] = ?    Q_{X=0}[C = 1 | f(Y) = 0] = ?

  18. Predictive Value Parity
      (figure: same example)
      Q_{X=1}[C = 1 | f(Y) = 1] = 8/9    Q_{X=1}[C = 1 | f(Y) = 0] = 0
      Q_{X=0}[C = 1 | f(Y) = 1] = 1/3    Q_{X=0}[C = 1 | f(Y) = 0] = 0
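Per-group PPV can be computed the same way as the earlier rates. The sketch below uses small hypothetical data (not the slide's figure) in which equalized odds holds but the base rates differ, so predictive value parity fails:

```python
def ppv(pred, label, x, group):
    """Estimate Q_{X=group}[C = 1 | f(Y) = 1] from samples."""
    vals = [y for p, y, g in zip(pred, label, x) if g == group and p == 1]
    return sum(vals) / len(vals)

# Hypothetical data: both groups have TPR = 1 and FPR = 1/2,
# but base rates are 1/2 (X = 1) vs 1/5 (X = 0).
pred  = [1, 1, 1, 0,  1, 1, 1, 0, 0]
label = [1, 1, 0, 0,  1, 0, 0, 0, 0]
x     = [1, 1, 1, 1,  0, 0, 0, 0, 0]
print(ppv(pred, label, x, 1), ppv(pred, label, x, 0))  # 2/3 vs 1/3: PPV parity fails
```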

  19. Trade-off
      • Proposition. Assume differing base rates and an imperfect classifier (f ≠ C). Then either
        – positive rate parity fails, or
        – predictive value parity fails.
      • We will see a similar result later in the course, due to Kleinberg, Mullainathan and Raghavan (2016).
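The tension follows from Bayes' rule: PPV = p·TPR / (p·TPR + (1 − p)·FPR), where p is the group's base rate. If equalized odds holds (same TPR and FPR for both groups) but base rates differ, the PPVs must differ unless the classifier is perfect. A sketch with assumed rates:

```python
def ppv_from_rates(base_rate, tpr, fpr):
    """PPV = Pr[C = 1 | f(Y) = 1], by Bayes' rule, from base rate, TPR, FPR."""
    return base_rate * tpr / (base_rate * tpr + (1 - base_rate) * fpr)

# Equalized odds holds: both groups share TPR = 0.8, FPR = 0.2.
tpr, fpr = 0.8, 0.2
ppv_a = ppv_from_rates(0.5, tpr, fpr)  # group with base rate 0.5
ppv_b = ppv_from_rates(0.2, tpr, fpr)  # group with base rate 0.2
print(ppv_a, ppv_b)  # 0.8 vs 0.5, so predictive value parity fails
```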

  20. Intuition
      • So far, the predictor is perfect.
      • Let's introduce an error.

  21. 21 Intuition • But this doesn't satisfy positive rate parity! • Let's fix that!

  22. 22 Intuition • Satisfies positive rate parity!

  23. 23 Intuition • Does not satisfy predictive value parity!

  24. (figure-only slide)

  25. Outline
      • Recap: Disparate Impact
        – Issues with Disparate Impact
      • Observational measures of fairness
        – Equal opportunity and Equalized odds
        – Predictive Value Parity
        – Tradeoff
      • Achieving Equalized Odds
        – Binary Classifier

  26. Equalized Odds
      • f satisfies equalized odds if f(Y) is conditionally independent of the protected attribute X given the outcome C.
      • Let f̂ be any classifier, out of the existing training pipeline for the problem at hand, that fails to satisfy equalized odds.

  27. Classifier f̂ that does not satisfy equalized odds
      (figure: example classification for X = 1 and X = 0)
      Q_{X=1}[f̂(Y) = 1 | C = 0] ≠ Q_{X=0}[f̂(Y) = 1 | C = 0]

  28. Derived Classifier
      • A new classifier f̃ is derived from f̂ and the protected attribute X
        – f̃ is independent of the features Y conditional on (f̂, X)
      • Q_{X=1}[f̃(Y) = c | C = 1] is
          Σ_{c′ ∈ {0,1}} Q_{X=1}[f̃(Y) = c | f̂(Y) = c′, X = 1] · Q_{X=1}[f̂(Y) = c′ | C = 1]
      • Q_{X=1}[f̃(Y) = c | C = 0] is
          Σ_{c′ ∈ {0,1}} Q_{X=1}[f̃(Y) = c | f̂(Y) = c′, X = 1] · Q_{X=1}[f̂(Y) = c′ | C = 0]
      • Q_{X=0}[f̃(Y) = c | C = 1] and Q_{X=0}[f̃(Y) = c | C = 0] are defined analogously.
      • Parameters, Pr[f̃ = c | f̂ = c′, X]:
          X = 1:   c′ = 0   c′ = 1        X = 0:   c′ = 0   c′ = 1
          c = 0:     p0       p1          c = 0:     p2       p3
          c = 1:   1 − p0   1 − p1        c = 1:   1 − p2   1 − p3
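The derived classifier's conditional rates are just mixtures of f̂'s rates, weighted by the table parameters. A minimal sketch (keeping the slide's convention that p is the probability of outputting 0; the function name is illustrative):

```python
def derived_rate(p_out0_if0, p_out0_if1, q_hat):
    """Q[f~(Y) = 1 | C = c, X = a] for the derived classifier, given
    q_hat = Q_{X=a}[f^(Y) = 1 | C = c] and the table entries
    p_out0_if0 = Pr[f~=0 | f^=0, X=a] and p_out0_if1 = Pr[f~=0 | f^=1, X=a]."""
    return (1 - p_out0_if0) * (1 - q_hat) + (1 - p_out0_if1) * q_hat

# Sanity checks: p = (1, 0) reproduces f^; p = (0, 1) gives 1 - f^.
print(derived_rate(1, 0, 0.25))  # 0.25  (f~ = f^)
print(derived_rate(0, 1, 0.25))  # 0.75  (f~ = 1 - f^)
```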

  29. Derived Classifier
      • Options for f̃:
        – f̃ = f̂ (marked +)
        – f̃ = 1 − f̂ (marked ×)
        – f̃ ≡ 1 (the point (1, 1))
        – f̃ ≡ 0 (the point (0, 0))
        – or some randomized combination of these
      (figure: plot of Q_{X=1}[f̃(Y) = 1 | C = 1] against Q_{X=1}[f̃(Y) = 1 | C = 0], with axes from 0.0 to 1.0; f̃ lies in the region enclosed by these four points)

  30. Derived Classifier
      (figure: the same axes, Q[f̃(Y) = 1 | C = 1] vs. Q[f̃(Y) = 1 | C = 0]; f̃ lies in one region for X = 0 and in another for X = 1, so an equalized-odds f̃ must lie in their intersection)

  31. Derived Classifier
      • Loss minimization: ℓ: {0,1}² → ℝ
        – ℓ(c, c′′) indicates the loss of predicting f̃(Y) = c when the correct label is c′′
      • Minimize the expected loss E[ℓ(f̃(Y), C)] s.t.
        – f̃ is derived
        – f̃ satisfies equalized odds:
          • Q_{X=1}[f̃(Y) = 1 | C = 1] = Q_{X=0}[f̃(Y) = 1 | C = 1]
          • Q_{X=1}[f̃(Y) = 1 | C = 0] = Q_{X=0}[f̃(Y) = 1 | C = 0]
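This optimization is typically solved as a small linear program over the four parameters. As a rough illustration only, a brute-force grid search can find an approximately-equalized-odds derived classifier minimizing expected 0/1 loss (all rates, weights, step size, and tolerance below are assumed for the example; the parameterization here is the probability of outputting 1):

```python
import itertools

def mix(p_if0, p_if1, q):
    """Pr[f~=1 | C=c] when Pr[f~=1 | f^=0] = p_if0, Pr[f~=1 | f^=1] = p_if1,
    and q = Pr[f^=1 | C=c] for the group in question."""
    return p_if0 * (1 - q) + p_if1 * q

def fit_equalized_odds(q1, q0, base1, base0, w1=0.5, step=0.1, tol=0.02):
    """q1 = (TPR, FPR) of f^ on group X=1, q0 likewise for X=0;
    base1/base0 are per-group base rates, w1 = Pr[X=1].
    Returns the grid point minimizing expected 0/1 loss subject to
    approximately equal TPR and FPR across groups."""
    grid = [i * step for i in range(int(round(1 / step)) + 1)]
    best, best_loss = None, float("inf")
    for a0, a1, b0, b1 in itertools.product(grid, repeat=4):
        tpr1, fpr1 = mix(a0, a1, q1[0]), mix(a0, a1, q1[1])
        tpr0, fpr0 = mix(b0, b1, q0[0]), mix(b0, b1, q0[1])
        if abs(tpr1 - tpr0) > tol or abs(fpr1 - fpr0) > tol:
            continue  # equalized-odds constraint violated
        loss = (w1 * (base1 * (1 - tpr1) + (1 - base1) * fpr1)
                + (1 - w1) * (base0 * (1 - tpr0) + (1 - base0) * fpr0))
        if loss < best_loss:
            best, best_loss = (a0, a1, b0, b1), loss
    return best, best_loss

params, loss = fit_equalized_odds(q1=(0.9, 0.3), q0=(0.7, 0.1),
                                  base1=0.5, base0=0.5)
print(params, loss)  # a feasible point always exists (e.g., constant classifiers)
```

A real implementation would replace the grid search with the exact LP; the brute force only conveys that the feasible set is the intersection of the two groups' achievable (TPR, FPR) regions.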

  32. Derived Classifier
      • E[ℓ(f̃(Y), C)] = Σ_{c, c′′ ∈ {0,1}} ℓ(c, c′′) · Pr[f̃(Y) = c, C = c′′]
      • Pr[f̃ = c, C = c′′]
          = Pr[f̃ = c, C = c′′ | f̃ = f̂] · Pr[f̃ = f̂] + Pr[f̃ = c, C = c′′ | f̃ ≠ f̂] · Pr[f̃ ≠ f̂]
          = Pr[f̂ = c, C = c′′] · Pr[f̃ = f̂] + Pr[f̂ = 1 − c, C = c′′] · Pr[f̃ ≠ f̂]
      • Based on the joint distribution of (f̂, C, X) and the parameter tables:
          X = 1:   c′ = 0   c′ = 1        X = 0:   c′ = 0   c′ = 1
          c = 0:     p0       p1          c = 0:     p2       p3
          c = 1:   1 − p0   1 − p1        c = 1:   1 − p2   1 − p3

  33. Summary: Multiple fairness measures
      • Demographic parity / disparate impact
        – Pro: used in the law
        – Con: perfect classification is impossible
        – Achieved by modifying the data
      • Equalized odds / equal opportunity
        – Pro: perfect classification is possible
        – Con: different groups can get different rates of positive prediction
        – Achieved by post-processing the classifier

  34. Summary: Multiple fairness measures
      • Equalized odds / equal opportunity
        – Different groups may be treated unequally
        – Maybe due to the problem itself
        – Maybe due to bias in the dataset
      • While demographic parity seems like a good fairness goal for society, equalized odds / equal opportunity seems to measure whether an algorithm itself is fair (independent of other factors, like the input data).
