Opportunities for Human-AI Collaborative Tools to Advance Development of Motivation Analytics

Steven C. Dang and Kenneth R. Koedinger. 10th International Learning Analytics and Knowledge Conference, Workshop on Learning Analytic Services to Support Personalized Learning & Assessment at Scale.


  1. Opportunities for Human-AI Collaborative Tools to Advance Development of Motivation Analytics. Steven C. Dang and Kenneth R. Koedinger. 10th International Learning Analytics and Knowledge Conference, Workshop on Learning Analytic Services to Support Personalized Learning & Assessment at Scale.

  2. Crystal Island Narrative-centered Learning Environment

  3. Operationalizing on New Systems
  • Off-task behavior can be indicative of cognitive engagement (Baker et al, 2004)
  • Rowe et al (2009) operationalized off-task behavior for Crystal Island
  • The narrative contains elements of "seductive detail"
  • Off-task = any student behavior that involves locations or objects not necessary for solving Crystal Island's science mystery (a labeling sketch follows below)
  [Screenshot: Crystal Island, a narrative-centered learning environment]
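A minimal sketch of how such a location/object rule can be applied to an event log. This is not the authors' code: the column names ('location', 'object') and the task-relevant sets are hypothetical placeholders, whereas the operationalization in Rowe et al (2009) is defined against Crystal Island's actual game content.

```python
# Hedged illustration, not the authors' implementation. Assumes a pandas event log
# with hypothetical 'location' and 'object' columns; the task-relevant sets below
# are placeholders for the locations/objects needed to solve the science mystery.
import pandas as pd

TASK_RELEVANT_LOCATIONS = {"infirmary", "laboratory", "dining_hall"}       # placeholder
TASK_RELEVANT_OBJECTS = {"microscope", "patient_chart", "field_notebook"}  # placeholder

def label_off_task(events: pd.DataFrame) -> pd.DataFrame:
    """Mark each event as off-task when it involves neither a task-relevant
    location nor a task-relevant object."""
    relevant = (
        events["location"].isin(TASK_RELEVANT_LOCATIONS)
        | events["object"].isin(TASK_RELEVANT_OBJECTS)
    )
    return events.assign(off_task=~relevant)
```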

  4. Accuracy of Construct Operationalization
  • Results raised construct validity questions:
  • Off-task behavior not related to pre-post learning
  • No relationship to achievement orientation or self-efficacy
  [Screenshot: Crystal Island, a narrative-centered learning environment]

  5. [Diagram: World, Data, Model]

  6. [Diagram: World, Data, Model, with Data Iteration and Model Iteration]

  7. Talk Overview
     1. The problem of confounding constructs
     2. Leveraging Behavior-based Psychometric Scales
     3. Common Challenges and Opportunities

  8. Confounding Constructs (Huggins-Manley et al, 2019)
  • Mono-operation bias threat
  • When a single indicator underrepresents a construct because the construct is more complex than a single indicator
  • Student motivations impact many student behaviors

  9. Leveraging Behavior-based Scales (under review)
  • Academic Diligence: "working assiduously on academic tasks which are beneficial in the long-run but tedious in the moment, especially in comparison to more enjoyable, less effortful diversions" (Galla et al, 2014)
  • Operational Measures: Time-on-task, Problems Completed

  10. Leveraging Behavior-based Scales (under review)
  • Academic Diligence: "working assiduously on academic tasks which are beneficial in the long-run but tedious in the moment, especially in comparison to more enjoyable, less effortful diversions" (Galla et al, 2014)
  • Operational Measures: Time-on-task, Problems Completed
  • Conflated with knowledge measures (a sketch of these naive measures follows below)
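For concreteness, a minimal sketch of the two naive operational measures named above, computed from a hypothetical per-transaction log ('student', 'duration_sec', 'problem_completed' columns). Both totals grow with how much material a student can get through, which is one route by which they become conflated with knowledge.

```python
# Hedged sketch with a hypothetical schema; not from the paper under review.
import pandas as pd

def naive_diligence_measures(log: pd.DataFrame) -> pd.DataFrame:
    """Per-student time-on-task and problems-completed totals."""
    return log.groupby("student").agg(
        time_on_task_sec=("duration_sec", "sum"),
        problems_completed=("problem_completed", "sum"),
    )
```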

  11. 50 Minute Class Period

  12. 50 Minute Class Period [Chart: activity timelines for Students 1-5]

  13. 50 Minute Class Period [Chart: activity timelines for Students 1-5, annotated with Start Speed, Sustained Effort, and Early Finish]

  14. 12-Measure Behavior-based Scale

  Measure | Behavior         | Statistic
  1       | Start Speed      | Absolute Mean
  2       | Start Speed      | Absolute Variance
  3       | Start Speed      | Scaled Mean
  4       | Start Speed      | Scaled Variance
  5       | Sustained Effort | Absolute Mean
  6       | Sustained Effort | Absolute Variance
  7       | Sustained Effort | Scaled Mean
  8       | Sustained Effort | Scaled Variance
  9       | Early Finish     | Absolute Mean
  10      | Early Finish     | Absolute Variance
  11      | Early Finish     | Scaled Mean
  12      | Early Finish     | Scaled Variance
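A sketch of how the 12 measures could be assembled from per-day activity. The deck does not give the exact behavior definitions, so the per-day proxies below (time to first problem, active minutes, idle minutes at the end of class) and the column names are assumptions; only the overall structure (3 behaviors x {absolute, class-scaled} x {mean, variance}) follows the table.

```python
# Hedged sketch with assumed per-day columns: student, class_id, day,
# seconds_to_first_problem, active_minutes, minutes_idle_at_end.
import pandas as pd

def twelve_measure_scale(days: pd.DataFrame) -> pd.DataFrame:
    # Illustrative per-day behavior values; these proxies are assumptions,
    # not the published definitions.
    behaviors = pd.DataFrame({
        "student": days["student"],
        "class_id": days["class_id"],
        "start_speed": -days["seconds_to_first_problem"],  # faster start -> higher value
        "sustained_effort": days["active_minutes"],
        "early_finish": days["minutes_idle_at_end"],
    })

    # Class-scaled versions: z-score each behavior within a class so a measure
    # reflects a student relative to peers working on the same assignment.
    scaled = behaviors.copy()
    for col in ["start_speed", "sustained_effort", "early_finish"]:
        scaled[col] = behaviors.groupby("class_id")[col].transform(
            lambda x: (x - x.mean()) / x.std(ddof=0)
        )

    def summarize(df: pd.DataFrame, prefix: str) -> pd.DataFrame:
        agg = df.groupby("student")[["start_speed", "sustained_effort", "early_finish"]]
        out = agg.agg(["mean", "var"])
        out.columns = [f"{prefix}_{behavior}_{stat}" for behavior, stat in out.columns]
        return out

    # 3 behaviors x 2 statistics x 2 scalings = 12 measures per student.
    return summarize(behaviors, "abs").join(summarize(scaled, "scaled"))
```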

  15. Psychometric Validation of the Scale
  • Factor analysis yielded 2 factors
  • Start Speed and Sustained Effort: related to Math Interest & Self-efficacy
  • Early Finishing: related to Effort Regulation
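A generic two-factor analysis of the 12 measures with scikit-learn, as an illustration of the validation step; the deck does not state which estimator or rotation was actually used, so this stand-in is an assumption.

```python
# Hedged illustration of the factor-analysis step, not a reproduction of the study's analysis.
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler
import pandas as pd

def two_factor_solution(measures: pd.DataFrame) -> pd.DataFrame:
    """measures: one row per student, 12 columns from the behavior-based scale."""
    X = StandardScaler().fit_transform(measures)
    fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0).fit(X)
    # Loadings show which measures group onto each factor.
    return pd.DataFrame(fa.components_.T, index=measures.columns,
                        columns=["factor_1", "factor_2"])
```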

  16. Psychometric Validation of the Scale
  • Factor analysis yielded 2 factors
  • Start Speed and Sustained Effort: related to Math Interest & Self-efficacy
  • Early Finishing: related to Effort Regulation
  • Goal was to identify less knowledge-dependent measures

  17. Combined measure yielded the best predictive model and was also reliable
  Final Grade ~ Gender + Ethnicity + SES + Prior Grade + Absenteeism + Diligence + (1 | Class)
  [Diagram: alternative ways of grouping the eight Start Speed and Sustained Effort measures into composite Diligence scores]
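The slide's formula is lme4-style; the sketch below refits the same structure in Python with statsmodels' MixedLM, where the groups argument plays the role of the (1 | Class) random intercept. All column names are placeholders for the study's variables.

```python
# Hedged sketch: same model structure as the slide's formula, placeholder column names.
import pandas as pd
import statsmodels.formula.api as smf

def fit_grade_model(df: pd.DataFrame):
    model = smf.mixedlm(
        "final_grade ~ gender + ethnicity + ses + prior_grade + absenteeism + diligence",
        data=df,
        groups=df["class_id"],   # random intercept per class, i.e. (1 | Class)
    )
    return model.fit()

# result = fit_grade_model(df); print(result.summary())
```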

  19. Common Challenges and Opportunities by Leveraging Behavior-based Scales
  • Defining Models
  • Iterating on Models

  20. Defining Models

  21. Model Parameter Setting
  • Aleven et al (2006) derived a model for help-seeking strategies from Self-Regulated Learning theory
  • Defined thresholds for "Familiar-at-all" and "Sense of what to do"
  • Set to values that were "intuitively plausible, given our past experience"
  • Behavior-based Scale ~ Past Experience
  • Developers can utilize data to similarly inform thresholds based on theory-informed expectations (see the sketch below)
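A sketch of the suggested move from "intuitively plausible" to data-informed thresholds: anchor the cut point to the empirical distribution of a behavior-based scale score. The 25th-percentile default is an illustrative assumption, not a value from Aleven et al (2006).

```python
# Hedged sketch: derive a threshold from observed scale scores instead of hand-picking it.
import numpy as np

def data_informed_threshold(scale_scores: np.ndarray, percentile: float = 25.0) -> float:
    """Return a cut point on a behavior-based scale, anchored to where
    students actually fall on that scale."""
    return float(np.percentile(scale_scores, percentile))

# Example: flag students whose Start Speed score falls below the data-informed threshold.
# threshold = data_informed_threshold(start_speed_scores)
# needs_support = start_speed_scores < threshold
```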

  22. Deriving Fully Machine Learned Models
  • Baker et al (2004) derived a wide range of features for input into the algorithm
  • e.g., P(know), time-on-last-3, help-in-last-8, etc.
  • Linear, quadratic, and interaction terms
  • Mathematical transforms of raw input data are common and valuable data science process tools (see the sketch below)
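A sketch of rolling-window feature derivation in the spirit of time-on-last-3 and help-in-last-8, over a hypothetical per-transaction log with 'student', 'timestamp', 'duration_sec', and 'is_help_request' columns; the exact feature definitions in Baker et al (2004) differ.

```python
# Hedged sketch with a hypothetical schema; feature definitions are illustrative.
import pandas as pd

def rolling_features(log: pd.DataFrame) -> pd.DataFrame:
    g = log.sort_values("timestamp").groupby("student")
    return log.assign(
        # total time spent on the student's last 3 transactions (including this one)
        time_on_last_3=g["duration_sec"].transform(
            lambda s: s.rolling(3, min_periods=1).sum()
        ),
        # number of help requests among the student's last 8 transactions
        help_in_last_8=g["is_help_request"].transform(
            lambda s: s.astype(int).rolling(8, min_periods=1).sum()
        ),
    )
```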

  23. Deriving Fully Machine Learned Models
  • Featuretools: automating feature engineering with Deep Feature Synthesis (Kanter & Veeramachaneni, 2015)
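Deep Feature Synthesis stacks aggregation primitives across related tables automatically. Rather than risk misquoting the featuretools API, here is a hand-rolled miniature of the same idea in plain pandas over a hypothetical transactions table, aggregated up to the student level.

```python
# Hedged, hand-rolled miniature of DFS-style aggregation; this is NOT the featuretools API.
import pandas as pd

PRIMITIVES = ["mean", "max", "count", "std"]

def dfs_like_features(transactions: pd.DataFrame,
                      numeric_cols=("duration_sec", "hints_used")) -> pd.DataFrame:
    # Apply each aggregation primitive to each numeric column, per student.
    agg_spec = {col: PRIMITIVES for col in numeric_cols}
    feats = transactions.groupby("student").agg(agg_spec)
    feats.columns = [f"{prim}_{col}" for col, prim in feats.columns]
    return feats
```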

  24. Hyperparameter Setting (Kuvalja et al, 2014)
  • Analyzed patterns of children's self-directed speech for measuring children's self-regulated learning
  • Required setting hyperparameters of the algorithm (e.g., minimum number of occurrences, threshold on the probability of observing a pattern)
  • Expert knowledge informed priors to set these thresholds
  • Behavior-based scale ~ Expert Knowledge
  • Given a target for a machine learning problem, autonomous ML algorithms can automatically find optimal values for hyperparameters on a representative sample of data (Kandasamy et al, 2019); see the sketch below
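A sketch of automated hyperparameter tuning with scikit-learn's random search, standing in for the Bayesian-optimization style tuners referenced on the slide (Kandasamy et al, 2019); the detector model, parameter grid, and scoring choice are all assumptions.

```python
# Hedged sketch: generic automated tuning of a behavior detector on labeled data (X, y).
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

def tune_detector(X, y):
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        param_distributions={
            "n_estimators": [100, 200, 400],
            "max_depth": [3, 5, 10, None],
            "min_samples_leaf": [1, 5, 20],  # analogous to a minimum-occurrences threshold
        },
        n_iter=20,
        cv=5,
        scoring="roc_auc",
        random_state=0,
    )
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```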

  25. Common Challenges and Opportunities by Leveraging Behavior-based Scales
  • Defining Models
  • Iterating on Models

  26. Accuracy of Construct Operationalization
  • Identified gaming in high and low post-test regardless of pre-test (Baker et al, 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al, 2008)

  27. Accuracy of Construct Operationalization
  • Identified gaming in high and low post-test regardless of pre-test (Baker et al, 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al, 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al, 2008)

  28. Accuracy of Construct Operationalization
  • Identified gaming in high and low post-test regardless of pre-test (Baker et al, 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al, 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al, 2008)
  [Diagram: a sequence of three quick help requests]

  29. Accuracy of Construct Operationalization
  • Identified gaming in high and low post-test regardless of pre-test (Baker et al, 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al, 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al, 2008)
  [Diagram: a sequence of three quick help requests followed by an unexpectedly slow attempt]
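A sketch of detecting the pattern the diagram illustrates: a run of quick help requests followed by an unexpectedly slow attempt, which the Shih et al (2008) line of work suggests may indicate reflection rather than harmful gaming. The thresholds, run length, and column names are illustrative assumptions.

```python
# Hedged sketch; thresholds and schema ('action' in {'help', 'attempt'}, 'duration_sec')
# are assumptions for illustration.
import pandas as pd

QUICK_SEC = 5    # help requests faster than this count as "quick" (assumption)
SLOW_SEC = 60    # attempts slower than this count as "unexpectedly slow" (assumption)
RUN_LEN = 3      # length of the quick-help run to look for (assumption)

def find_reflection_candidates(log: pd.DataFrame) -> pd.DataFrame:
    """log: one student's time-ordered transactions."""
    quick_help = (log["action"] == "help") & (log["duration_sec"] < QUICK_SEC)
    run_of_quick_help = quick_help.astype(int).rolling(RUN_LEN).sum() == RUN_LEN
    slow_attempt = (log["action"] == "attempt") & (log["duration_sec"] > SLOW_SEC)
    # The pattern fires when a slow attempt immediately follows the quick-help run.
    pattern = slow_attempt & run_of_quick_help.shift(1, fill_value=False)
    return log[pattern]
```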

  30. Supporting Model Iteration
  • Support qualitative analysis for behavior discovery
  • Text-replay method (Baker & de Carvalho, 2008)
  • Overwhelming quantity of data: 1 class of 15 students at 2x/week = 200k transactions (see the sampling sketch below)
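A sketch of one way to keep text replays tractable at this volume: cut each student's log into fixed-length clips and sample a manageable number for hand coding. The clip length, sample size, and schema ('student', 'timestamp') are assumptions; Baker & de Carvalho (2008) describe their own clip construction.

```python
# Hedged sketch; clip construction details are assumptions, not the published method.
import random
import pandas as pd

def sample_clips(log: pd.DataFrame, clip_len: int = 8, n_clips: int = 50,
                 seed: int = 0) -> list:
    """Split each student's time-ordered log into clips of `clip_len` transactions,
    then sample `n_clips` of them for qualitative coding (text replay)."""
    clips = []
    for _, student_log in log.sort_values("timestamp").groupby("student"):
        for start in range(0, len(student_log), clip_len):
            clips.append(student_log.iloc[start:start + clip_len])
    random.Random(seed).shuffle(clips)
    return clips[:n_clips]
```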

  31. Supporting Model Iteration
  • Leveraging a supervision signal to guide search through the data
  • Extending the hurt vs non-hurt analysis
  • Identify outlier students based on theoretically informed expectations; this narrows the transactions to review (15k) (see the sketch below)
  • Additional work is needed to investigate how to leverage explainable-AI work to support more efficient browsing of sequential behavior data for anomalous patterns
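A sketch of using a theory-informed expectation as the supervision signal for narrowing the search: fit the expected relationship between a behavior-based scale score and an outcome, then surface the students who deviate most from it for transaction-level review. The diligence-predicts-learning-gain relationship and the 2-standard-deviation cutoff are illustrative assumptions.

```python
# Hedged sketch; the expectation being fit and the cutoff are assumptions for illustration.
import numpy as np
import pandas as pd

def flag_outlier_students(df: pd.DataFrame, z_cut: float = 2.0) -> pd.DataFrame:
    """df: one row per student with 'diligence' and 'learning_gain' columns (placeholders)."""
    slope, intercept = np.polyfit(df["diligence"], df["learning_gain"], deg=1)
    residual = df["learning_gain"] - (slope * df["diligence"] + intercept)
    z = (residual - residual.mean()) / residual.std(ddof=0)
    # Students far from the theory-informed expectation get reviewed first.
    return df.assign(expectation_residual=residual, outlier=np.abs(z) > z_cut)
```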

  32. Conclusion
  • Behavior-based psychometric scales yield more valid & reliable measurement
  • More valid measurements lead to better initial analytic models
  • Scales allow human experts to embed theoretical expectations into the data, and algorithms can leverage this information to more intelligently tackle many data science tasks
  • Opportunity to investigate how tools can leverage behavior scale information to support qualitative analysis processes to identify shortcomings in the operationalized construct

  33. Acknowledgements
  Ken Koedinger, Matt Bernacki, Queenie Kravitz, David Klahr, Audrey Russo, Sharon Carver, Franceska Xhakaj, Ken Holstein, Julian Ramos, Judith Tucker
  Questions?
