Why is Chicago deceptive?: Towards Building Model-Driven Tutorials - PowerPoint PPT Presentation

“Why is ‘Chicago’ deceptive?”: Towards Building Model-Driven Tutorials for Humans
Vivian Lai, Han Liu, and Chenhao Tan (@vivwylai | @HanLiuAI | @ChenhaoTan)
University of Colorado Boulder, machineintheloop.com 1

AI used in societally critical tasks: recidivism prediction, medical diagnosis, autonomous driving, Amazon secret AI hiring tool (Geiger et al. 2012; European Parliament 2016; Kleinberg et al. 2017; Dastin 2018) 2

Explanations! 4

Explaining AI is tricky 5

Why is explaining AI tricky? Two distinct learning modes: emulating and discovering.
Discovering: AI can discover inconspicuous and counterintuitive patterns. 9

So, how can explaining AI be less tricky? Model-driven tutorials: • Elucidate counterintuitive patterns • Enhance humans' ability to understand patterns 10

Model-driven tutorials: Guidelines State-of-the-art science communication 11

Model-driven tutorials: Examples. How do we choose examples? • SP-LIME (Ribeiro et al. 2016) • Spaced repetition 13
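The slide names SP-LIME as one way to choose tutorial examples but does not show how it works. The sketch below illustrates the greedy submodular pick idea from Ribeiro et al. 2016 under stated assumptions: `weights` is a hypothetical matrix of per-example explanation weights (rows are examples, columns are features), and the function name `submodular_pick` and the toy numbers are made up for illustration. It is not the authors' implementation.

```python
import math


def submodular_pick(weights, budget):
    """Greedily pick up to `budget` rows that maximize weighted feature coverage."""
    n_features = len(weights[0])
    # Global importance of a feature: sqrt of its summed absolute weights.
    importance = [
        math.sqrt(sum(abs(row[j]) for row in weights)) for j in range(n_features)
    ]

    def coverage(selected):
        # A feature is covered if any selected example gives it a nonzero weight.
        covered = {
            j for i in selected for j in range(n_features) if weights[i][j] != 0
        }
        return sum(importance[j] for j in covered)

    chosen = []
    for _ in range(min(budget, len(weights))):
        remaining = [i for i in range(len(weights)) if i not in chosen]
        # Add the example that yields the highest total coverage.
        best = max(remaining, key=lambda i: coverage(chosen + [i]))
        chosen.append(best)
    return chosen


# Hypothetical explanation weights for four examples over four features.
toy_weights = [
    [0.9, 0.0, 0.0, 0.0],
    [0.8, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
picked = submodular_pick(toy_weights, budget=2)
print(picked)
```

With these toy weights the second example is picked first because it covers the two most important features, and the third example is picked next because it adds a feature the first pick does not cover.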

Experimental Design & Research Questions
• RQ1: Effect of different tutorials. Training: different tutorials. Prediction: no assistance.
• RQ2: Effect of real-time assistance. Training: same tutorial (spaced repetition). Prediction: different real-time assistance.
• RQ3: Effect of model complexity. RQ1 & RQ2: linear model. RQ3: deep model.
Performed qualitative study to improve interface design. 21

Research question 1: Can model-driven tutorials improve human performance without any real-time assistance in the prediction phase? (Training: model-driven tutorials. Prediction: human accuracy?) 22

Tutorials are useful to some extent. Accuracy (%) by tutorial condition:
• Control: 54.6%
• Guidelines: 60.4%
• Spaced repetition: 57.9%
• Spaced repetition + guidelines: 59.2%
p-values marked on the chart: p=0.018* and p=0.1 (# of stars indicates p-values; *: p < 0.05, **: p < 0.01, ***: p < 0.001).
Participant comment: “The tutorial is helpful but it’s just hard not being able to reference it.” 24

Research question 2: If not, how do varying levels of real-time assistance in the prediction phase affect human performance after training? (Spectrum: full human agency to full automation) 25

Prediction: various levels of real-time assistance, on a spectrum from full human agency to full automation. Information from AI increases from left to right:
• Unsigned explanations
• Signed explanations
• Signed explanations + predicted label
• Signed explanations + predicted label + guidelines
• Signed explanations + predicted label + guidelines + accuracy statement 27

Unsigned explanations Signed explanations 28
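The difference between the two explanation styles can be sketched in code. The example below assumes a linear model whose word weights are known. The weights in `coef`, the function name `highlight`, and the sample sentence are hypothetical and chosen only for illustration. A positive weight is taken to point toward "deceptive" and a negative weight toward "genuine". Unsigned explanations mark which words matter, while signed explanations also show the direction.

```python
def highlight(tokens, coef, top_k=2, signed=True):
    """Tag the top_k most heavily weighted words in a review."""
    # Rank the distinct words that have a nonzero weight by absolute weight.
    ranked = sorted(
        {t.lower() for t in tokens if coef.get(t.lower(), 0.0) != 0.0},
        key=lambda w: abs(coef[w]),
        reverse=True,
    )
    top = set(ranked[:top_k])
    tagged = []
    for token in tokens:
        word = token.lower()
        if word not in top:
            tagged.append((token, None))  # not highlighted
        elif not signed:
            tagged.append((token, "important"))  # magnitude only
        else:
            # The sign of the weight gives the direction of the evidence.
            tagged.append((token, "deceptive" if coef[word] > 0 else "genuine"))
    return tagged


# Hypothetical word weights; these are not values from the study.
coef = {"chicago": 1.2, "my": 0.4, "floor": -0.9}
tokens = "My room in Chicago was on the third floor".split()

signed_view = highlight(tokens, coef, signed=True)
unsigned_view = highlight(tokens, coef, signed=False)
print(signed_view)
print(unsigned_view)
```

In the unsigned view "Chicago" and "floor" receive the same tag, so a reader cannot tell which label each word supports. The signed view separates them.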

Real-time assistance improves performance. Accuracy (%) by level of assistance in the prediction phase:
• No assistance: 60.4%
• Unsigned: 57.8%
• Signed: 70.7%
• Signed + predicted label + guidelines + accuracy: 74%
• Machine: 86%
Two comparisons are marked p=0.001*** (# of stars indicates p-values; *: p < 0.05, **: p < 0.01, ***: p < 0.001).
Signed highlights are sufficient: p>0.05 is marked between Signed and Signed + predicted label + guidelines + accuracy.
Gap between human+AI & AI (Poursabzi-Sangdeh et al. 2018; Green & Chen 2019; Lage et al. 2019; Lai & Tan 2019; Carton et al. 2020; Lai et al. 2020) 31

Research question 3: Can our results generalize to other models? How do model complexity and explanation methods affect human performance with/without training? (Simple model vs. deep model) 32

SVM explanations BERT attention explanations 33

SVM explanations BERT LIME explanations 34

Simple model = better human performance. Training leads to better performance. Accuracy (%) with and without training:
• SVM: 72.8% (training), 64.1% (no training)
• BERT-ATT: 58.2% (training), 54.1% (no training)
• BERT-LIME: 64.9% (training), 59.2% (no training)
Comparisons marked on the charts: p=0.001*** (# of stars indicates p-values; *: p < 0.05, **: p < 0.01, ***: p < 0.001). (Lai et al. 2019) 37
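As a quick check on the size of the training effect, the snippet below computes the difference in percentage points for each model from the accuracies on the slide. Treating the first number of each pair as the training condition is an assumption based on the order of the chart legend. The dictionary layout and the function name `training_gain` are illustrative only.

```python
# Accuracies (%) as read from the slide. Which bar is "training" versus
# "no training" is assumed from the legend order.
accuracy = {
    "SVM": {"training": 72.8, "no_training": 64.1},
    "BERT-ATT": {"training": 58.2, "no_training": 54.1},
    "BERT-LIME": {"training": 64.9, "no_training": 59.2},
}


def training_gain(acc):
    """Return the accuracy gain from training, in percentage points, per model."""
    return {
        model: round(values["training"] - values["no_training"], 1)
        for model, values in acc.items()
    }


gains = training_gain(accuracy)
for model, gain in gains.items():
    print(f"{model}: +{gain} points with training")
```

Under that assumption the gain is largest for SVM (8.7 points) and smallest for BERT-ATT (4.1 points), with BERT-LIME in between (5.7 points).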

Takeaway
• Tutorials somewhat improve human performance
• Explanations from simple models are preferred
• Future directions for human-centered tutorials and explanations
Vivian Lai, Han Liu, Chenhao Tan (@vivwylai | vivwylai@gmail.com | @HanLiuAI | @ChenhaoTan), University of Colorado Boulder
Website: machineintheloop.com
Paper: https://tinyurl.com/model-driven-tutorials
Workshop: https://tinyurl.com/harness-explanations 38
