Explaining Privacy and Fairness Violations in Data-Driven Systems
Matt Fredrikson, Carnegie Mellon University
Joint effort with Emily Black, Gihyuk Ko, Klas Leino, Anupam Datta, Sam Yeom, Piotr Mardziel, and Shayak Sen
Data-driven systems are ubiquitous: credit, education, healthcare, law enforcement, web services, …
Data-driven systems are opaque
[Diagram: user data flows into an online advertising system, which produces decisions]
Opacity and privacy
“…able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a ‘pregnancy prediction’ score. Take a fictional Target shopper who … bought cocoa-butter lotion, a purse large enough to double as a diaper bag, zinc and magnesium supplements and a bright blue rug. There’s, say, an 87 percent chance that she’s pregnant.”
Opacity and fairness
[Image source: Han Huang, Reuters]
Inappropriate information use
Both problems can be seen as inappropriate use of protected information.
• Fairness/discrimination: use of race or gender for employment decisions; business-necessity exceptions
• Privacy: use of health or political background for marketing; exceptions derive from contextual information norms
This is a type of bug!
Agenda
Methods for dealing with inappropriate information use:
• Detecting when it occurs
• Providing diagnostic information to developers
• Automatic repair, when possible
Remaining talk:
• Formalize “inappropriate information use”
• Show how it applies to classifiers
• Generalize to continuous domains
• Nonlinear continuous models & applications
Explicit use via causal influence [Datta, Sen, Zick, Oakland ’16]
Example: credit decisions. [Diagram: Age and Income both feed a classifier that uses only income to produce a decision.] Because age is correlated with income, age is associated with the decision even though the classifier never uses it.
Conclusion: measures of association are not informative about use.
Causal intervention
[Diagram: instances with Age (44, 21, 28, 63) and Income ($90K, $20K, $100K, $10K) fed to a classifier that uses only income]
Replace a feature with random values from the population, and examine the distribution over outcomes.
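A minimal sketch of this intervention in Python, assuming a scikit-learn-style model with a `.predict` method and a NumPy feature matrix; the function name and defaults are illustrative, not the paper's tooling:

```python
import numpy as np

def intervention_influence(model, X, feature, n_rounds=10, seed=0):
    """Estimate the causal influence of one feature on a classifier by
    replacing its column with values drawn from the population (here, a
    random permutation of the observed column) and measuring how often
    the decision changes."""
    rng = np.random.default_rng(seed)
    base = model.predict(X)              # decisions as the system runs normally
    changed = 0.0
    for _ in range(n_rounds):
        X_int = X.copy()
        X_int[:, feature] = rng.permutation(X[:, feature])  # intervene on one feature
        changed += np.mean(model.predict(X_int) != base)
    return changed / n_rounds
```

For the classifier above, intervening on Income would change many decisions, while intervening on Age would change none.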
Challenge: indirect (proxy) use
[Diagram: a classifier that targets older people uses # years in same job, unpaid mortgage?, income, … to produce a decision]
Need to determine when an information type is inferred and then used.
Proxy use: a closer look
What do we mean by proxy use?
1. Explicit use is also proxy use. [Decision tree: Age > 60? → T: N, F: Y]
2. “Inferred use” is also proxy use. [Decision tree: yrs in job > 10? branches to unpaid mortgage? checks, which decide Y or N]
• Inferred values must be influential
• Associations must be two-sided
One- and two-sided associations
What happens if we allow one-sided association? Consider this model: [Decision tree: zip code → Pittsburgh: Ad #1, Philadelphia: Ad #2]
• Uses postal code to determine state
• Zip code can predict race…
• …but not the other way around
This is a benign use of information that’s associated with a protected information type.
Proxy use: a closer look
[Decision tree: women’s college? → interested? → Accept/Reject, where the accept decisions on the two branches mirror each other, so the output carries no association with gender]
3. Output association is unnecessary for proxy use: a model can use a gender proxy internally even when its output is unassociated with gender.
Towards a formal definition: axiomatic basis
• (Axiom 1: Explicit use) If random variable Z is an influential input of the model A, then A makes proxy use of Z.
• (Axiom 2: Preprocessing) If a model A makes proxy use of Z, and A’(x) = A(x, f(x)), then A’ also makes proxy use of Z. Example: A’ infers a protected piece of info given directly to A.
• (Axiom 3: Dummy) If A’(x, x’) = A(x) for all x and x’, then A’ has proxy use of Z exactly when A does. Example: a feature never touched by the model.
• (Axiom 4: Independence) If Z is independent of the inputs of A, then A does not have proxy use of Z. Example: the model obtains no information about the protected type.
Extensional proxy use axioms are inconsistent
Key intuition:
• Preprocessing forces us to preserve proxy use under function composition
• But the rest of the model can cancel out a composed proxy
Concretely, let X and Z be independent random bits and Y = X ⊕ Z (so X, Y, Z are pairwise independent):
• A(Y, Z) = Y ⊕ Z makes proxy use of Z (explicit use axiom)
• So does A’(Y, Z, X) = Y ⊕ Z (dummy axiom)
• And so does A’’(Z, X) = A’(X ⊕ Z, Z, X) (preprocessing axiom)
• But A’’(Z, X) = X ⊕ Z ⊕ Z = X, and X, Z are independent, so by the independence axiom A’’ does not make proxy use of Z: a contradiction.
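The cancellation can be checked mechanically; a small Python sketch of the construction above:

```python
import itertools

# X and Z are independent bits, and Y = X ^ Z, so X, Y, Z are pairwise independent.
A  = lambda y, z: y ^ z            # explicit use of Z
A2 = lambda y, z, x: A(y, z)       # dummy axiom: x is never touched
A3 = lambda z, x: A2(x ^ z, z, x)  # preprocessing axiom: feed Y = X ^ Z

# A3 collapses to the identity on X: the composed proxy cancels out, yet
# the first three axioms force us to say A3 makes proxy use of Z while
# the independence axiom says it cannot.
assert all(A3(z, x) == x for z, x in itertools.product([0, 1], repeat=2))
```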
Syntactic relaxation
• We address this with a syntactic definition
• Composition is tied to how the function is represented as a program
• Checking for proxy use requires access to program internals
[Decision tree: women’s college → interested? → offer / no offer]
Models as Programs
• Expressions that produce a value
• No loops or other complexities
• But often very large

⟨exp⟩ ::= ℝ | True | False | var | op(⟨exp⟩, …, ⟨exp⟩) | if (⟨exp⟩) then {⟨exp⟩} else {⟨exp⟩}

Operations:
• arithmetic: +, -, *, etc.
• boolean connectives: or, and, not, etc.
• relations: ==, <, ≤, >, etc.
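A minimal Python sketch of this expression language; the class and operator names are illustrative, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class Const:
    value: Any               # a real number, True, or False

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Op:
    op: str                  # "+", "and", "<=", ...
    args: Tuple[Any, ...]    # sub-expressions

@dataclass(frozen=True)
class If:
    cond: Any
    then: Any
    els: Any

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b,
       "and": lambda a, b: a and b, "or": lambda a, b: a or b,
       "==": lambda a, b: a == b, "<=": lambda a, b: a <= b}

def evaluate(e, env):
    """⟦exp⟧ : Instance -> Value; an instance is a dict of feature values."""
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Op):
        return OPS[e.op](*(evaluate(a, env) for a in e.args))
    if isinstance(e, If):
        return evaluate(e.then if evaluate(e.cond, env) else e.els, env)
    raise TypeError(e)

# The running example as an expression (leaf placement is illustrative):
tree = If(Var("womens_college"),
          If(Var("interested"), Const("offer"), Const("no offer")),
          If(Var("interested"), Const("no offer"), Const("offer")))
```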
Modeling Systems | Probabilistic Semantics
Expression semantics: ⟦exp⟧ : Instance → Value, i.e. ⟦exp⟧ : I → V, where I is a random variable over dataset instances and V is a random variable over the expression’s value.
[Decision tree with labeled subexpressions exp_0 … exp_9: women’s college? → interested? → offer / no offer]
This gives a joint distribution over the input instance (I) and the value (V_i) of each expression exp_i:
Pr[I, V_0, V_1, …, V_9]
• marginals: Pr[V_4 = True, V_0 = Ad 1]
• conditionals: Pr[V_4 = True | V_0 = Ad 1]
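A sketch of the empirical counterpart of these distributions, reusing the `evaluate` interpreter from the previous sketch; function names are illustrative:

```python
from collections import Counter

def empirical_joint(exprs, instances):
    """Empirical counterpart of Pr[I, V_0, ..., V_n]: evaluate every
    labeled subexpression on each dataset instance and normalize the
    counts of the resulting value tuples."""
    n = len(instances)
    joint = Counter(tuple(evaluate(e, x) for e in exprs) for x in instances)
    return {vals: c / n for vals, c in joint.items()}

def conditional(joint, i, vi, j, vj):
    """Pr[V_i = vi | V_j = vj], read off the empirical joint."""
    num = sum(p for vals, p in joint.items() if vals[i] == vi and vals[j] == vj)
    den = sum(p for vals, p in joint.items() if vals[j] == vj)
    return num / den
```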
Program decomposition
Decomposition: Given a program p, a decomposition (p_1, X, p_2) consists of two programs p_1, p_2 and a fresh variable X such that replacing X with p_1 inside p_2 yields p.
[Example: the yrs in job > 10 / unpaid mortgage? tree split into p_1 = yrs in job? and p_2 = the remaining tree with hole X]
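In code, a decomposition is just substitution into a hole; a sketch reusing the expression classes from the earlier sketch:

```python
def substitute(p2, hole, p1):
    """Replace the fresh variable `hole` in p2 with p1. A triple
    (p1, X, p2) is a decomposition of p exactly when
    substitute(p2, "X", p1) equals the original program p."""
    if isinstance(p2, Var):
        return p1 if p2.name == hole else p2
    if isinstance(p2, Op):
        return Op(p2.op, tuple(substitute(a, hole, p1) for a in p2.args))
    if isinstance(p2, If):
        return If(substitute(p2.cond, hole, p1),
                  substitute(p2.then, hole, p1),
                  substitute(p2.els, hole, p1))
    return p2  # Const: nothing to substitute
```

Because the expression classes are frozen dataclasses with structural equality, checking `substitute(p2, "X", p1) == p` directly tests whether a candidate triple is a valid decomposition.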
Characterizing proxies
Proxy: Given a decomposition (p_1, X, p_2) and a random variable Z, p_1 is a proxy for Z if ⟦p_1⟧(I) is associated with Z.
[Example: p_1 = women’s college? is a proxy for “gender = Female”]
Characterizing use
Influential decomposition: A decomposition (p_1, X, p_2) is influential if X can change the outcome of p_2.
[Example: the yrs in job / unpaid mortgage tree, with hole X feeding p_2]
Putting it all together
Proxy use: A program p has proxy use of random variable Z if there exists an influential decomposition (p_1, X, p_2) of p such that p_1 is a proxy for Z.
This is close to our intuition from earlier. Formally, it satisfies similar axioms:
• Dummy and independence axioms remain largely unchanged
• Explicit use and preprocessing rely on program decomposition instead of function composition
Quantitative proxy use
A decomposition (p_1, X, p_2) is an (ε, δ)-proxy use of Z when:
• the association between ⟦p_1⟧ and Z is ≥ ε, and
• p_1’s influence in p_2, ι(p_1, p_2), is ≥ δ.
A program has (ε, δ)-proxy use of Z when it admits a decomposition that is an (ε, δ)-proxy use of Z.
Quantifying decomposition influence
1. Intervene on p_1.
2. Compare the behavior with the intervention against the system running normally.
3. Measure divergence:
ι(p_1, p_2) = E_{X,X’}[ ⟦p⟧(X) ≠ ⟦p_2⟧(X, ⟦p_1⟧(X’)) ]
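A sketch of this estimate, treating p_1 and p_2 as Python callables where p_2 takes an instance together with the hole's value; names are illustrative:

```python
import random

def influence(p1, p2, instances, seed=0):
    """Empirical estimate of ι(p1, p2): for each instance x, compare the
    normal run p2(x, p1(x)) against a run where the hole receives p1's
    value on a different instance x'."""
    rng = random.Random(seed)
    xs = list(instances)
    x_primes = rng.sample(xs, len(xs))  # a permutation stands in for fresh draws x'
    disagreements = sum(p2(x, p1(x)) != p2(x, p1(xp))
                        for x, xp in zip(xs, x_primes))
    return disagreements / len(xs)
```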
Algorithmics
Detection: does the system have an (ε, δ)-proxy use of a protected variable?
• Basic algorithm: O(S · N²), where S = # expressions, N = # dataset instances
Repair: how do we remove an (ε, δ)-proxy-use violation?
• Naive algorithm: replace exp_i with a constant
  – O(1) for any constant
  – O(N · M) for the best constant, where M = # possible values
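A naive detection loop over candidate decompositions, with mutual information as one possible association measure (the paper considers several) and the `influence` estimator from the previous sketch:

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information between two discrete sequences; one simple
    choice of association measure, used here only for illustration."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def find_proxy_uses(decompositions, instances, z_values, eps, delta):
    """Flag each candidate decomposition (p1, X, p2) whose hole value is
    both associated with the protected variable Z (>= eps) and
    influential on the outcome (>= delta). `influence` is the estimator
    sketched on the previous slide."""
    witnesses = []
    for p1, p2 in decompositions:
        v1 = [p1(x) for x in instances]          # ⟦p1⟧ over the dataset
        if (mutual_information(v1, z_values) >= eps
                and influence(p1, p2, instances) >= delta):
            witnesses.append((p1, p2))           # witness of a violation
    return witnesses
```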
Witnesses
[Decision tree with labeled subexpressions exp_0 … exp_9; witness: exp_0 = (zip = z_1 or z_3) with branches leading to no offer / offer]
Using witnesses:
• Demonstrate a violation in the system
• Localize where scrutiny/human eyeballs need to be applied
• Determine what repair should be applied