CMSC 20370/30370 Winter 2020 Technology Considered Harmful? Case Study: Facial Recognition and Bias Mar 6, 2020
Quiz Time (5-7 minutes). Quiz covers Facial Recognition and Bias, and Principles of Good Design
Administrivia • GP4 video due on Monday for the video screening • Next week’s video showcases will be in: – Monday: Room 390 – Wednesday: Room 298 • The schedule of groups is online • Please send us links to your videos ahead of the class session so we can load them all on one laptop
Today’s Agenda • Is technology considered harmful? • Case Study: Facial recognition and Bias
Case Study: Facial Recognition and Bias • Looked at existing face data sets to see the composition of male vs. female and lighter- vs. darker-skinned faces • Evaluated 3 commercial classifiers and found they all perform worst on darker-skinned females • Provides implications for fairness in machine learning and for assessing algorithm accuracy
Fairness and Recidivism ProPublica 2016 study: • Borden: rode an unlocked bike and scooter with a friend down the street ($80 worth) • Prater: $83 of shoplifting from Walmart • Borden: misdemeanors as a juvenile • Prater: already convicted of armed robbery and attempted armed robbery
Bail based on risk score… • 2016 ProPublica study • COMPAS, an algorithm used for recidivism prediction • Produces much higher false positive rates for black defendants than for white defendants (a sketch of computing this rate per group follows below) • Recidivism = the likelihood that an offender will re-offend • Examined 7,000 risk scores in Florida from 2013/2014
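To make "break the metric out by group" concrete, here is a minimal sketch of computing a false positive rate per group. This is not the ProPublica analysis itself; the column names and toy data are hypothetical.

```python
# Minimal sketch: false positive rate per group (hypothetical columns, toy data).
import pandas as pd

# Toy stand-in for risk-score data: 1 = predicted high risk, 1 = actually re-offended
df = pd.DataFrame({
    "group":      ["a", "a", "a", "b", "b", "b"],
    "high_risk":  [1,    1,   0,   1,   0,   0],
    "reoffended": [0,    1,   0,   0,   0,   1],
})

def false_positive_rate(sub):
    # FPR = share of people who did NOT re-offend but were still flagged high risk
    negatives = sub[sub["reoffended"] == 0]
    return (negatives["high_risk"] == 1).mean()

# One overall number can hide group disparities; report the rate per group instead
print(df.groupby("group").apply(false_positive_rate))
```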
Hiring and fairness • A 2018 study found that the ranking algorithm of Xing, a job platform similar to LinkedIn, exhibits bias • It ranked more qualified female candidates lower than less qualified male candidates
Case Study: Facial Recognition and Bias – Skin Tone Map
Created a data set of parliamentarians from African and European countries (the Pilot Parliaments Benchmark)
• Evaluated 3 commercial gender classifiers from Microsoft, IBM, and Face++ • Looked at overall accuracy for gender classification • Then broke it out by skin tone and gender (see the sketch below) • Found that all three perform worst on darker-skinned females • All three also perform worse on darker skin than on lighter skin
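As a rough illustration of breaking accuracy out by intersectional subgroup (skin tone x gender) instead of reporting one overall number, here is a hypothetical sketch; it is not the evaluation code from the study, and the labels and numbers are made up.

```python
# Minimal sketch: overall vs. subgroup accuracy (toy, hypothetical data).
import pandas as pd

df = pd.DataFrame({
    "skin_tone": ["darker", "darker", "darker", "lighter", "lighter", "lighter"],
    "gender":    ["female", "female", "male",   "female",  "male",    "male"],
    "correct":   [0,         1,        1,        1,         1,         1],  # 1 = classifier was right
})

# A single overall accuracy can look reasonable...
print("overall accuracy:", df["correct"].mean())

# ...but the intersectional breakdown shows who the errors fall on
print(df.groupby(["skin_tone", "gender"])["correct"].mean())
```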
Gender Classification Performance
What if this is due to image quality? • They also wanted to check whether the gap was just because images from European countries had higher resolution and better poses • Did another analysis on the South African data alone, since its range of skin tones is wide • The South African subset also has a more balanced set of darker skin tones • Found that the trends remain …
Why does this bias happen? • Could be the training data used for the algorithms • There could be fewer instances of darker-skinned people in the training set • For instance, darker skin could be correlated with facial characteristics that are not well represented in the data set
Fairness in Machine Learning • Lots of definitions of fairness – Such as being blind to “protected” attributes such as race or gender – Equalized odds – Individual fairness, etc. (two of these are sketched below) • Watch “21 Definitions of Fairness in ML” by Arvind Narayanan @ Princeton • https://www.youtube.com/watch?v=wqamrPkF5kk
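For reference, here is a sketch of two of the definitions mentioned above, written for a binary predictor \hat{Y}, true label Y, and protected attribute A; this is only a slice of the many definitions Narayanan discusses.

```latex
% Demographic parity ("blindness" to A in outcomes): equal positive-prediction rates
\Pr(\hat{Y}=1 \mid A=a) \;=\; \Pr(\hat{Y}=1 \mid A=b)

% Equalized odds: equal true positive and false positive rates across groups
\Pr(\hat{Y}=1 \mid Y=y,\, A=a) \;=\; \Pr(\hat{Y}=1 \mid Y=y,\, A=b)
\qquad \text{for } y \in \{0,1\}
```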
Other sources of bias • Skewed samples and confirmation bias over time • Uneven data sets • Limited features for minority groups • Proxies – An algorithm may not use race directly, but other demographic information can be correlated with race • Tainted examples – E.g. Word2Vec word embeddings trained on Google News – Associate “computer programmer” with “man” and “homemaker” with “woman” (see the sketch below)
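The word-embedding association can be probed directly with pretrained vectors. Below is a hedged sketch using gensim; it assumes the pretrained Google News word2vec binary has been downloaded locally (the file path is illustrative), and the exact neighbours returned depend on that model.

```python
# Sketch: probing gender associations in pretrained word2vec vectors with gensim.
from gensim.models import KeyedVectors

# Path is illustrative; assumes the Google News vectors were downloaded beforehand
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# Analogy query: "man is to computer_programmer as woman is to ?"
# (The Google News vectors use underscores for multi-word phrases.)
print(vectors.most_similar(positive=["woman", "computer_programmer"],
                           negative=["man"], topn=5))

# Direct similarity checks along the same lines
print(vectors.similarity("man", "computer_programmer"))
print(vectors.similarity("woman", "homemaker"))
```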
Case Study Discussion Points • Can’t rely on a single accuracy metric • Have to break performance metrics out by subsets of the classification • Facial recognition algorithms give confidence scores with their classifications – Need to report other metrics, such as true positive rate and false positive rate (a thresholding sketch follows below) • Image quality, such as pose and illumination, can confound results (and algorithms)
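Since the classifiers return confidence scores, one common way to get true and false positive rates is to threshold those scores and compare against ground truth. A minimal sketch with hypothetical scores and a hypothetical threshold is below.

```python
# Minimal sketch: turning confidence scores into TPR/FPR at a chosen threshold.
import numpy as np

# Hypothetical confidence scores for the "female" class, with ground-truth labels
scores = np.array([0.95, 0.80, 0.40, 0.30, 0.90, 0.20])
truth  = np.array([1,    1,    1,    0,    0,    0])      # 1 = actually female

threshold = 0.5
predicted = (scores >= threshold).astype(int)

tpr = ((predicted == 1) & (truth == 1)).sum() / (truth == 1).sum()   # true positive rate
fpr = ((predicted == 1) & (truth == 0)).sum() / (truth == 0).sum()   # false positive rate

print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")  # report both, not just one overall accuracy
```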
HCI uses ML
But how else can HCI help to make inclusive technology? • It’s machine learning, but HCI has a part to play • How to visualize different forms of bias in ML • Understand how to help overcome bias in different parts of the ML pipeline
HCI, Ethics, and ML • How are data sets used over time? • Where are images gathered from? • How can we help people identify biases? • How can we help people document how they create an ML pipeline?
Let’s consider whether technology is harmful or not • Break up into groups of 4-5 • Discuss the following questions (10 minutes): – Do you think we can ever achieve fairness? – What do you think about technology and the amplification of human intent? – How can you balance using technology and achieving inclusivity in design? – What are the top three things you learned in this class? • Let’s share a few responses afterwards
Summary • Increasing reliance on algorithms for decisions that impact humans in some way – Hiring, bail, criminal investigations, facial recognition • Have to think of ways to incorporate fairness into machine learning problems • HCI has a part to play to make ML more fair and inclusive • Technology is useful but can also be unintentionally harmful • Remember that at the end of the day technology only amplifies what is already there
Coming up… • GP 4 video showcase on Monday and Wednesday • GP 4 reports due in 2 weeks
Get in touch: Office hours: Fridays 2-4pm (Sign up in advance) or by appointment JCL 355 Email: marshini@uchicago.edu