SLIDE 1 Aditya V. Nori
Microsoft Research Cambridge
Joint work with Samuel Drews, Aws Albarghouthi, Loris D’Antoni (University of Wisconsin-Madison)
SLIDE 2 Data is everywhere!
▪ Data analysis is a big part of today’s software
▪ Developers are increasingly creating and using machine learning models
▪ Developers are increasingly working with data that is incomplete, inaccurate, or approximate
SLIDE 3 ▪ Bringing data into programs
- Numerous new sources
- Conversion
▪ Reasoning with data
- What does correctness mean?
Data sources
Reasoning: Cleaning, Combining, Statistics, Machine Learning
Action: Visualize, Report, Recommend
SLIDE 4
SLIDE 5
SLIDE 6
SLIDE 7
SLIDE 8
How do we prove that a program does not discriminate?
SLIDE 9
Theoreticians
▪ How do we formalise fairness?
Machine learning researchers
▪ How do we learn fair models?
Security/privacy researchers
▪ How do we detect bias in black-box algorithms?
Legal scholars
▪ How do we regulate algorithmic decision making?
SLIDE 10 EU GDPR (2018)
▪ “data subject’s explicit consent”
▪ “right to explanation”
White House report (2014)
▪ “Powerful algorithms … raise the potential of encoding discrimination in automated decisions.”
White House report (recently …)
▪ “Federal agencies that use AI-based systems to make or provide decision support for consequential decisions about individuals should take extra care to ensure the efficacy and fairness of those systems, based on evidence-based verification and validation.”
SLIDE 11
1) Fairness as a program property
2) Automatic proofs of (un)fairness
SLIDE 12
SLIDE 13
SLIDE 14 population model Hoare triple ☺
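The population model and decision program shown on this slide did not survive extraction. As a minimal sketch of what such a pair might look like, assume a hypothetical hiring scenario. The names `pop_model` and `dec`, the variables, the distributions, and the thresholds are all illustrative assumptions, not taken from the slides.

```python
import random


def pop_model(rng):
    """Hypothetical population model: draw one random applicant.

    All distributions and constants are assumptions for illustration.
    """
    ethnicity = rng.gauss(0, 10)
    col_rank = rng.gauss(25, 10)   # college rank (lower is better)
    y_exp = rng.gauss(10, 5)       # years of experience
    if ethnicity > 10:             # assumed correlation with the sensitive attribute
        col_rank += 5
    return ethnicity, col_rank, y_exp


def dec(col_rank, y_exp):
    """Hypothetical decision program: returns True if the applicant is hired.

    Note that it never reads the sensitive attribute directly.
    """
    exp_rank = y_exp - col_rank
    if col_rank <= 5:
        return True
    return exp_rank > -5


rng = random.Random(0)
ethnicity, col_rank, y_exp = pop_model(rng)
hired = dec(col_rank, y_exp)
```

In this sketch the decision can still depend on the sensitive attribute indirectly through `col_rank`, which is why the property is stated over the composition of the two programs rather than over `dec` alone.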
SLIDE 15
SLIDE 16
SLIDE 17 by definition of conditional probability
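The formula this step applies to is missing from the extracted slide. The following is a sketch of the expansion, assuming the group-fairness ratio used in the FairSquare paper cited on the last slide. Here hire is the program's decision, min is membership in the protected group, and ε is a tolerance; all three names are assumptions.

```latex
\frac{\Pr[\mathit{hire} \mid \mathit{min}]}
     {\Pr[\mathit{hire} \mid \lnot\mathit{min}]} > 1 - \epsilon
\quad\Longleftrightarrow\quad
\frac{\Pr[\mathit{hire} \land \mathit{min}] \,/\, \Pr[\mathit{min}]}
     {\Pr[\mathit{hire} \land \lnot\mathit{min}] \,/\, \Pr[\lnot\mathit{min}]} > 1 - \epsilon
```

Under this reading, checking the property reduces to computing (or bounding) four unconditional probabilities of program events.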
SLIDE 18
𝛲: set of all possible execution paths in dec(popModel())
𝑞(𝜌): probability of path 𝜌 ∈ 𝛲
SLIDE 19
𝛲: set of all possible execution paths in dec(popModel())
𝑞(𝜌): probability of path 𝜌 ∈ 𝛲
SLIDE 20
𝛲: set of all possible execution paths in dec(popModel())
𝑞(𝜌): probability of path 𝜌 ∈ 𝛲
What does this mean?
SLIDE 21
SLIDE 22 Each path is uniquely represented by 3 real values
Idea: represent paths 𝛲ℎ𝑛 as a region 𝜒 ⊆ ℝ³
and compute:
SLIDE 23
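Volume: the integral of 1 over the region 𝜒, taken over the 3 real values
Weighted volume: the integral over 𝜒 of the variables' probability densities (for example, the density term 𝑞𝑧(𝑧))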
SLIDE 24
SLIDE 25
SLIDE 26
SLIDE 27
SLIDE 28
Represent all executions as an SMT formula 𝜒
Compute the weighted volume of 𝜒
Computing the volume of a polytope is #P-hard [Dyer and Frieze, 1988]
SLIDE 29
SLIDE 30
Rectangles are easy!
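One way to see why rectangles are easy: if the variables are independent, the weighted volume of an axis-aligned box factors into a product of one-dimensional probabilities. The sketch below assumes independent Gaussian variables. The helper names `gauss_cdf` and `box_weighted_volume` are hypothetical.

```python
import math


def gauss_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def box_weighted_volume(box, params):
    """Probability mass of an axis-aligned box under independent Gaussians.

    box:    list of (lo, hi) bounds, one pair per dimension
    params: list of (mean, std), one pair per dimension (assumed independent)
    """
    volume = 1.0
    for (lo, hi), (mean, std) in zip(box, params):
        # Each dimension contributes the mass of its own interval.
        volume *= gauss_cdf(hi, mean, std) - gauss_cdf(lo, mean, std)
    return volume


# Mass of the box [-1, 1] x [-1, 1] under two standard Gaussians.
mass = box_weighted_volume([(-1.0, 1.0), (-1.0, 1.0)], [(0.0, 1.0), (0.0, 1.0)])
```

No multi-dimensional integration is needed: each factor is a difference of two CDF values.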
SLIDE 31 There are infinitely many rectangles
SLIDE 32
Hyperrectangular decomposition
▪ consider all hyperrectangles in 𝜒
Hyperrectangular sampling
▪ Iteratively sample hyperrectangles 𝐼 ⇒ 𝜒
SLIDE 33
SLIDE 34
SLIDE 35
SLIDE 36
SLIDE 37
SLIDE 38 Ideal solution: sample with the following objective
SLIDE 39
Approximate densities with step functions
▪ area under a step function is a linear formula
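To illustrate the step-function idea, the sketch below under-approximates a Gaussian density by a piecewise-constant function and computes the area under it between two bounds. The function names, bin count, and truncation range are assumptions chosen for illustration. Within a single step the area is height × width, which is linear in the bounds.

```python
import math


def gauss_pdf(x, mean, std):
    """Density of a Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def lower_step_function(mean, std, lo, hi, bins):
    """Return steps (left, right, height) with height <= pdf on each bin."""
    width = (hi - lo) / bins
    steps = []
    for i in range(bins):
        left = lo + i * width
        right = left + width
        # A Gaussian pdf is unimodal, so its minimum on a bin is at an endpoint.
        height = min(gauss_pdf(left, mean, std), gauss_pdf(right, mean, std))
        steps.append((left, right, height))
    return steps


def step_area(steps, a, b):
    """Area under the step function on [a, b]: a sum of height * overlap terms."""
    area = 0.0
    for left, right, height in steps:
        overlap = max(0.0, min(b, right) - max(a, left))
        area += height * overlap
    return area


steps = lower_step_function(0.0, 1.0, -4.0, 4.0, 800)
approx = step_area(steps, -1.0, 1.0)   # under-approximates the true mass of [-1, 1]
```

Because every step lies below the true density, the resulting area never exceeds the true probability, and more bins tighten the gap.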
SLIDE 40
▪ Maintains a lower bound on the volume
▪ Converges to the actual volume in the limit
▪ Works for real closed fields
▪ To compute an upper bound, negate the formula
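The lower-bound and upper-bound behaviour can be seen in a toy setting. The sketch below is not the SMT-guided sampling of the talk: it swaps in plain quadtree bisection, uses unweighted area in two dimensions, and takes a hypothetical region x + y ≤ 1 inside the unit square. It keeps the same structure: sum disjoint boxes inside the region for a lower bound, and do the same for the negated region to get an upper bound.

```python
def inside(box, holds):
    """True if every corner of the 2-D box satisfies `holds`.

    For a convex region (here a half-plane) this implies the whole box is inside.
    """
    (x0, x1), (y0, y1) = box
    return all(holds(x, y) for x in (x0, x1) for y in (y0, y1))


def lower_bound(box, holds, depth):
    """Sum the areas of disjoint boxes that lie entirely inside the region."""
    (x0, x1), (y0, y1) = box
    if inside(box, holds):
        return (x1 - x0) * (y1 - y0)
    if depth == 0:
        return 0.0  # give up on this box; the bound stays sound
    xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    quadrants = [((x0, xm), (y0, ym)), ((xm, x1), (y0, ym)),
                 ((x0, xm), (ym, y1)), ((xm, x1), (ym, y1))]
    return sum(lower_bound(q, holds, depth - 1) for q in quadrants)


unit_square = ((0.0, 1.0), (0.0, 1.0))


def region(x, y):
    return x + y <= 1.0


def negated(x, y):
    return x + y > 1.0


lower = lower_bound(unit_square, region, 7)
# Upper bound on the region = total area minus a lower bound on its negation.
upper = 1.0 - lower_bound(unit_square, negated, 7)
```

The true area is 0.5. Increasing `depth` moves `lower` up and `upper` down towards it, which mirrors convergence in the limit.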
SLIDE 41
SLIDE 42
SLIDE 43
SLIDE 44 Automatic proofs of (un)fairness for decision-making programs
Future directions
▪ Scalability – application to real-world programs
▪ Explaining unfairness
▪ Repairing unfair programs
FairSquare: Probabilistic Verification of Program Fairness. Aws Albarghouthi, Loris D'Antoni, Samuel Drews, Aditya Nori, OOPSLA '17