Aditya V. Nori Microsoft Research Cambridge Joint work with Samuel Drews, Aws Albarghouthi, Loris D’Antoni (University of Wisconsin-Madison)
Data is everywhere! ▪ Data analysis is a big part of today’s software ▪ Developers are increasingly creating and using machine learning models ▪ Developers are increasingly working with data that is incomplete, inaccurate, or approximate
▪ Bringing data into programs o Numerous new sources o Conversion, Cleaning, Combining, Statistics, Machine Learning [Figure: Data sources → Reasoning → Action (Visualize, Report, Recommend)] ▪ Reasoning with data o What does correctness mean?
How do we prove that a program does not discriminate?
Theoreticians ▪ How do we formalise fairness? Machine learning researchers ▪ How do we learn fair models? Security/privacy researchers ▪ How do we detect bias in black-box algorithms? Legal scholars ▪ How do we regulate algorithmic decision making?
EU GDPR (2018) ▪ “data subject’s explicit consent” ▪ “right to explanation” White House report (2014) ▪ “Powerful algorithms … raise the potential of encoding discrimination in automated decisions.” White House report (recently …) ▪ “Federal agencies that use AI-based systems to make or provide decision support for consequential decisions about individuals should take extra care to ensure the efficacy and fairness of those systems, based on evidence-based verification and validation.”
Fairness as a program property 1) Automatic proofs of (un)fairness 2)
population model Hoare triple ☺
by definition of conditional probability
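For reference, the definition being invoked can be written out as follows. The event names $\mathit{hire}$ and $\mathit{min}$ are placeholders assumed here for illustration; they do not appear on the slide.

```latex
\Pr[\mathit{hire} \mid \mathit{min}]
  \;=\; \frac{\Pr[\mathit{hire} \wedge \mathit{min}]}{\Pr[\mathit{min}]},
  \qquad \Pr[\mathit{min}] > 0
```

A ratio of two such conditional probabilities therefore reduces to ratios of joint and marginal probabilities, each of which is a probability of a set of execution paths.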
𝛲 : set of all possible execution paths in dec(popModel()) 𝑞(𝜌) : probability that 𝜌 ∈ 𝛲
What does this mean?
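One way to build intuition for "paths with probabilities" is to run a toy version of dec(popModel()) many times and count how often each path is taken. The sketch below is only illustrative: the bodies of `pop_model` and `dec`, the 0.3 and 12.0 constants, and the Gaussian parameters are all hypothetical, and it uses Monte Carlo sampling, which gives estimates rather than the proofs this talk is about.

```python
import random
from collections import Counter


def pop_model(rng):
    # Hypothetical population model: returns (is_minority, years_experience).
    is_minority = rng.random() < 0.3
    years_exp = rng.gauss(10.0, 5.0)
    return is_minority, years_exp


def dec(person):
    # Hypothetical decision program: returns (branch label, hire decision).
    _, years_exp = person
    if years_exp > 12.0:
        return "exp>12", True
    return "exp<=12", False


def estimate_path_probabilities(n_samples=100_000, seed=0):
    # Estimate the probability of each execution path by sampling.
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        person = pop_model(rng)
        branch, hired = dec(person)
        # A path is identified by the branches taken in both functions.
        counts[(person[0], branch, hired)] += 1
    return {path: c / n_samples for path, c in counts.items()}


probs = estimate_path_probabilities()
# The probability of hiring is the sum over all paths that end in "hire".
p_hire = sum(p for (_, _, hired), p in probs.items() if hired)
```

Here the path probabilities sum to 1, and the probability of any event (such as "hired") is the sum of the probabilities of the paths on which it occurs.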
Each path is uniquely represented by 3 real values. Idea: represent paths 𝛲 ℎ𝑛 as a region 𝜒 ⊆ ℝ³ and compute:
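Volume: $\int_{\chi} 1 \, dx \, dy \, dz$ Weighted volume: $\int_{\chi} q_x(x) \, q_y(y) \, q_z(z) \, dx \, dy \, dz$, where $x, y, z$ are the three real values that identify a path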
▪ Represent all executions as an SMT formula 𝜒 ▪ Compute the weighted volume of 𝜒 ▪ Computing the volume of a polytope is #P-hard [Dyer and Frieze, 1988]
Rectangles are easy!
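To see why rectangles are the easy case: when the variables are independent, the weighted volume of an axis-aligned box factors into a product of one-dimensional integrals, each of which is a difference of CDF values. The sketch below assumes independent Gaussian variables; the function names are illustrative and not taken from the talk.

```python
import math


def gauss_cdf(x, mean, std):
    # Cumulative distribution function of a Gaussian, via the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def rectangle_volume(bounds):
    # Unweighted volume: product of the side lengths.
    volume = 1.0
    for lo, hi in bounds:
        volume *= hi - lo
    return volume


def rectangle_weighted_volume(bounds, params):
    # Weighted volume under independent Gaussians: product of the
    # probability mass that each variable places on its own interval.
    volume = 1.0
    for (lo, hi), (mean, std) in zip(bounds, params):
        volume *= gauss_cdf(hi, mean, std) - gauss_cdf(lo, mean, std)
    return volume


# Example: the box [-1, 1]^3 under three standard normal variables.
box = [(-1.0, 1.0)] * 3
standard_normals = [(0.0, 1.0)] * 3
plain = rectangle_volume(box)
weighted = rectangle_weighted_volume(box, standard_normals)
```

No integration over a general polytope is needed: each dimension is handled separately and the results are multiplied.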
There are infinitely many rectangles
Hyperrectangular decomposition ▪ consider all hyperrectangles in 𝜒 Hyperrectangular sampling ▪ Iteratively sample 𝐼 ⇒ 𝜒
Ideal solution: sample with the following objective
Approximate densities with step functions ▪ area under a step function is a linear formula
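A minimal sketch of the step-function idea, assuming a standard Gaussian density and a uniform grid of steps (both are choices made here for illustration). Each step takes the smaller of the density values at its two endpoints, so the step function never exceeds the density; within a single step, the area over [a, b] is height × (b − a), which is linear in a and b.

```python
import math


def gauss_pdf(x, mean=0.0, std=1.0):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def lower_step_function(lo, hi, n_steps):
    # Build steps (left, right, height) lying below the density.
    # For a unimodal density, the minimum over an interval is attained
    # at one of its endpoints, so min(left, right) is a valid lower step.
    width = (hi - lo) / n_steps
    steps = []
    for i in range(n_steps):
        left = lo + i * width
        right = left + width
        steps.append((left, right, min(gauss_pdf(left), gauss_pdf(right))))
    return steps


def area_under_steps(steps, a, b):
    # Sum of height * overlap-length: a linear expression per step.
    total = 0.0
    for left, right, height in steps:
        overlap = max(0.0, min(b, right) - max(a, left))
        total += height * overlap
    return total


steps = lower_step_function(-4.0, 4.0, 800)
approx_mass = area_under_steps(steps, -1.0, 1.0)
exact_mass = math.erf(1.0 / math.sqrt(2.0))  # true Gaussian mass on [-1, 1]
```

Using more steps tightens the approximation while keeping every area computation a sum of linear terms.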
▪ Maintains a lower bound on the volume ▪ Converges to the actual volume in the limit ▪ Works for real closed fields ▪ To compute an upper bound, negate the formula
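These properties can be illustrated on a toy region. The sketch below replaces the SMT-guided rectangle sampling with a plain uniform grid, which is a deliberate simplification: it sums the areas of grid cells that lie entirely inside the region x + y ≤ 1 within the unit square (true area 0.5) to get a lower bound, and applies the same procedure to the negated region to get an upper bound. The region, the grid, and all function names are assumptions made for this example.

```python
def lower_bound(n, cell_inside):
    # Sum the areas of the grid cells that lie entirely inside a region.
    # The cells are disjoint, so the sum never exceeds the true area.
    width = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = i * width, j * width
            if cell_inside(x0, y0, x0 + width, y0 + width):
                total += width * width
    return total


def cell_in_region(x0, y0, x1, y1):
    # Region x + y <= 1: the largest value of x + y in a cell is at (x1, y1).
    return x1 + y1 <= 1.0


def cell_in_negation(x0, y0, x1, y1):
    # Negated region x + y > 1: the smallest value of x + y is at (x0, y0).
    return x0 + y0 > 1.0


def bounds(n):
    lower = lower_bound(n, cell_in_region)
    # Upper bound on the region = 1 - lower bound on its negation.
    upper = 1.0 - lower_bound(n, cell_in_negation)
    return lower, upper


# Powers of two keep the grid coordinates exact in floating point.
results = {n: bounds(n) for n in (4, 16, 64)}
```

As the grid is refined, the lower bound rises and the upper bound falls toward the true area, so a property such as "the area is at least 0.45" can be settled as soon as one bound crosses the threshold.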
Automatic proofs of (un)fairness for decision-making programs Future directions ▪ Scalability – application to real-world programs ▪ Explaining unfairness ▪ Repairing unfair programs FairSquare: Probabilistic Verification for Program Fairness. Aws Albarghouthi, Loris D'Antoni, Samuel Drews, Aditya Nori, OOPSLA '17