“Why is ‘Chicago’ deceptive?”: Towards Building Model-Driven Tutorials for Humans Vivian Lai, Han Liu, and Chenhao Tan @vivwylai | @HanLiuAI | @ChenhaoTan University of Colorado Boulder machineintheloop.com 1
AI used in societally critical tasks: recidivism prediction, medical diagnosis, Amazon secret AI hiring tool, autonomous driving (Geiger et al. 2012; European Parliament 2016; Kleinberg et al. 2017; Dastin 2018) 2
Explanations! 4
Explaining AI is tricky 5
Why is explaining AI tricky? Two distinct learning modes: emulating and discovering 6
Why is explaining AI tricky? Two distinct learning modes: emulating 7
Why is explaining AI tricky? Two distinct learning modes: discovering 8
Why is explaining AI tricky? Two distinct learning modes: discovering. AI can discover inconspicuous and counterintuitive patterns. 9
So, how can explaining AI be less tricky? Model-driven tutorials • Elucidate counterintuitive patterns • Enhance humans' ability to understand patterns 10
Model-driven tutorials: Guidelines State-of-the-art science communication 11
Model-driven tutorials: Examples How do we choose examples? • SP-LIME (Ribeiro et al. 2016) • Spaced repetition 12
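The SP-LIME idea of choosing tutorial examples can be sketched as a greedy coverage problem: pick the few examples whose explanations together touch the most globally important features. The snippet below is a minimal sketch of that greedy step, not the authors' pipeline. The function name `submodular_pick`, the toy importance matrix, and the budget are assumptions made for illustration.

```python
import math


def submodular_pick(importances, budget):
    """Greedily pick up to `budget` example indices that cover important features.

    importances: one row per example; each row holds the absolute explanation
    weight of every feature for that example (0 means the feature is unused).
    """
    n_features = len(importances[0])
    # Global importance of a feature: square root of its summed weight.
    global_imp = [
        math.sqrt(sum(row[j] for row in importances)) for j in range(n_features)
    ]

    chosen, covered = [], set()
    for _ in range(min(budget, len(importances))):
        best_i, best_gain = None, -1.0
        for i, row in enumerate(importances):
            if i in chosen:
                continue
            # Marginal gain: importance of features this example newly covers.
            gain = sum(
                global_imp[j]
                for j in range(n_features)
                if row[j] > 0 and j not in covered
            )
            if gain > best_gain:
                best_i, best_gain = i, gain
        chosen.append(best_i)
        covered.update(j for j in range(n_features) if importances[best_i][j] > 0)
    return chosen


# Hypothetical toy data: 4 reviews x 5 words (values are made up).
toy = [
    [0.9, 0.8, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.4, 0.0],
    [0.0, 0.8, 0.0, 0.0, 0.1],
]
picked = submodular_pick(toy, budget=2)
print(picked)
```

With this toy matrix the first pick covers the two heaviest words, so the second pick goes to the example that introduces new words rather than the one that repeats an already covered word.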
Experimental Design & Research Questions RQ1: Effect of different tutorials Training → Prediction 14
Experimental Design & Research Questions RQ1: Effect of different tutorials Training (different tutorials) → Prediction (no assistance) 15
Experimental Design & Research Questions Training RQ1: Effect of different tutorials 16
Experimental Design & Research Questions RQ2: Effect of real-time assistance Training (same tutorial: spaced repetition) → Prediction (different real-time assistance) 17
Experimental Design & Research Questions Training (RQ1: Effect of different tutorials) → Prediction (RQ2: Effect of real-time assistance) 18
Experimental Design & Research Questions RQ1 & RQ2: linear model; RQ3: deep model 19
Experimental Design & Research Questions Training (RQ1: Effect of different tutorials) → Prediction (RQ2: Effect of real-time assistance); RQ3: Effect of model complexity 20
Experimental Design & Research Questions Training (RQ1: Effect of different tutorials) → Prediction (RQ2: Effect of real-time assistance); RQ3: Effect of model complexity. Performed qualitative study to improve interface design. 21
Research question 1: Can model-driven tutorials improve human performance without any real-time assistance in the prediction phase? Training (model-driven tutorials) → Prediction (human accuracy?) 22
Tutorials are useful to some extent. Accuracy (%): Control 54.6; Guidelines 60.4; Spaced repetition 57.9; Spaced repetition + guidelines 59.2. Significance annotations on the chart: p=0.018*, p=0.1 (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). 23
Tutorials are useful to some extent. Accuracy (%): Control 54.6; Guidelines 60.4; Spaced repetition 57.9; Spaced repetition + guidelines 59.2. Significance annotations on the chart: p=0.018*, p=0.1 (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). Quote: “The tutorial is helpful but it’s just hard not being able to reference it.” 24
Research question 2: If not, how do varying levels of real-time assistance in the prediction phase affect human performance after training? Spectrum: full human agency to full automation 25
Prediction: various levels of real-time assistance, from full human agency to full automation: Unsigned explanations; Signed explanations; Signed explanations + predicted label; Signed explanations + predicted label + guidelines; Signed explanations + predicted label + guidelines + accuracy statement. Information from AI increases from left to right. 26
Prediction: various levels of real-time assistance, from full human agency to full automation: Unsigned explanations; Signed explanations; Signed explanations + predicted label; Signed explanations + predicted label + guidelines; Signed explanations + predicted label + guidelines + accuracy statement. 27
Unsigned explanations Signed explanations 28
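The difference between the two explanation styles can be illustrated with a small sketch. It assumes a linear model that assigns each word a weight, where a positive weight pushes toward "deceptive" and a negative weight toward "genuine". Signed highlights keep that direction, while unsigned highlights only mark that a word matters. The `highlight` helper, the bracket notation, and every weight below are hypothetical and chosen only for illustration.

```python
def highlight(tokens, weights, top_k=2, signed=True):
    """Mark the top_k most influential tokens in a review.

    weights: hypothetical per-word weights from a linear model; positive is
    assumed to push toward "deceptive", negative toward "genuine".
    """
    unique = list(dict.fromkeys(tokens))  # de-duplicate, keep first-seen order
    ranked = sorted(unique, key=lambda t: abs(weights.get(t, 0.0)), reverse=True)
    top = set(ranked[:top_k])

    out = []
    for tok in tokens:
        w = weights.get(tok, 0.0)
        if tok not in top or w == 0.0:
            out.append(tok)              # not highlighted
        elif not signed:
            out.append(f"[{tok}]")       # importance only
        else:
            out.append(f"[+{tok}]" if w > 0 else f"[-{tok}]")  # importance and direction
    return " ".join(out)


# Made-up weights for illustration only.
weights = {"chicago": 1.2, "my": 0.7, "location": -0.9, "floor": -0.4}
review = "my stay in chicago had a great location".split()

signed_view = highlight(review, weights, signed=True)
unsigned_view = highlight(review, weights, signed=False)
print(signed_view)
print(unsigned_view)
```

In the unsigned view a reader sees that "chicago" and "location" matter but not which label each one supports, which is the information the signed view adds.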
Real-time assistance improves performance. Accuracy (%): No assistance 60.4; Unsigned 57.8; Signed 70.7; Signed + predicted label + guidelines + accuracy 74; Machine 86. Significance annotations on the chart: p=0.001***, p=0.001*** (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). 29
Signed highlights are sufficient. Accuracy (%): No assistance 60.4; Unsigned 57.8; Signed 70.7; Signed + predicted label + guidelines + accuracy 74; Machine 86. Significance annotation on the chart: p>0.05 (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). 30
Gap between human+AI & AI. Accuracy (%): No assistance 60.4; Unsigned 57.8; Signed 70.7; Signed + predicted label + guidelines + accuracy 74; Machine 86 (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). Poursabzi-Sangdeh et al. 2018; Green & Chen 2019; Lage et al. 2019; Lai & Tan 2019; Carton et al. 2020; Lai et al. 2020 31
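The size of the gap can be read directly off the accuracies reported on these slides. The sketch below only does that arithmetic, in percentage points. The dictionary layout and variable names are assumptions, and the numbers are copied from the chart labels.

```python
# Accuracies (%) as reported on the slides for each prediction-phase condition.
accuracy = {
    "No assistance": 60.4,
    "Unsigned": 57.8,
    "Signed": 70.7,
    "Signed + predicted label + guidelines + accuracy": 74.0,
}
machine = 86.0
baseline = accuracy["No assistance"]

# For each condition: (gain over no assistance, remaining gap to the machine).
results = {}
for condition, acc in accuracy.items():
    gain = round(acc - baseline, 1)
    gap = round(machine - acc, 1)
    results[condition] = (gain, gap)
    print(f"{condition}: {gain:+.1f} pts vs. no assistance, {gap:.1f} pts below machine")
```

Even the richest assistance level remains 12 points below the machine alone, while signed highlights by themselves account for 10.3 of the 13.6 points gained over no assistance.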
Research question 3: Can our results generalize to other models? How do model complexity and explanation methods affect human performance with/without training? Simple model vs. deep model 32
SVM explanations BERT attention explanations 33
SVM explanations BERT LIME explanations 34
Simple model = better human performance. Accuracy (%), training / no training: SVM 72.8 / 64.1; BERT-ATT 58.2 / 54.1; BERT-LIME 64.9 / 59.2. Significance annotations on the chart: p=0.001***, p=0.001*** (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). Lai et al. 2019 35
Simple model = better human performance. Accuracy (%), training / no training: SVM 72.8 / 64.1; BERT-ATT 58.2 / 54.1; BERT-LIME 64.9 / 59.2. Significance annotations on the chart: p=0.001***, p=0.001***. 36
Training leads to better performance. Accuracy (%), training / no training: SVM 72.8 / 64.1; BERT-ATT 58.2 / 54.1; BERT-LIME 64.9 / 59.2. Significance annotations on the chart: p=0.001***, p=0.001***, p=0.001*** (# of stars indicates p-values; ***: p < 0.001, **: p < 0.01, *: p < 0.05). 37
Takeaway • Tutorials somewhat improve human performance • Explanations from simple models are preferred • Future directions for human-centered tutorials and explanations. Vivian Lai, Han Liu, Chenhao Tan @vivwylai | vivwylai@gmail.com @HanLiuAI | @ChenhaoTan University of Colorado Boulder Website: machineintheloop.com Paper: https://tinyurl.com/model-driven-tutorials Workshop: https://tinyurl.com/harness-explanations 38