
Predicting Performance on MOOC Assessments using Multi-Regression Models



  1. Predicting Performance on MOOC Assessments using Multi-Regression Models
     Zhiyun Ren, Huzefa Rangwala, Aditya Johri
     George Mason University, 4400 University Drive, Fairfax, Virginia 22030

  2. Outline
     - Background
     - Personal Linear Multi-Regression Models
     - Feature selection
     - Experiments and discussion
     - Conclusion and future work

  3. Background

  4. Background

  5. Overview
     - Information we have: MOOC server logs
     - What we want to do: predict students’ performance

  6. Challenge
     - Various kinds of participants
     - High attrition rate
     - Flexible timetable
     - Baselines we have tried: linear regression model, mean score

  7. Personal Linear Multi-Regression Models

     $$\hat{g}_{s,a} = b_s + b_c + p_s^T W f_{sa} = b_s + b_c + \sum_{d=1}^{l} \left( p_{s,d} \sum_{k=1}^{n_F} f_{sa,k}\, w_{d,k} \right)$$

     - $l$: number of regression models
     - $n_F$: number of features

     Training objective: minimize over $(W, P, B)$ the squared error $\sum_{s,a} (g_{s,a} - \hat{g}_{s,a})^2$ plus regularization terms on $P$ and $W$.
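As a rough illustration of the prediction rule on this slide, the sketch below computes a grade as the sum of two bias terms and a membership-weighted combination of l linear models over n_F features. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function name `plmr_predict`, the argument names, and every numeric value are made up for illustration, and the second bias is written `b_c` on the assumption that the slide follows the usual PLMR notation.

```python
def plmr_predict(b_s, b_c, p_s, W, f_sa):
    """Return b_s + b_c + sum_d p_s[d] * dot(W[d], f_sa).

    p_s  : l membership weights of one student (hypothetical values below)
    W    : l rows of n_F regression coefficients
    f_sa : n_F features for one student/assessment pair
    """
    if len(p_s) != len(W):
        raise ValueError("p_s needs one membership weight per regression model")
    total = b_s + b_c
    for weight, coeffs in zip(p_s, W):
        if len(coeffs) != len(f_sa):
            raise ValueError("each row of W needs one coefficient per feature")
        # Output of one linear regression model, scaled by the membership weight.
        total += weight * sum(w * f for w, f in zip(coeffs, f_sa))
    return total


# Toy example with l = 2 models and n_F = 3 features (all values illustrative).
W = [[0.5, 0.0, 0.25],
     [0.0, 1.0, 0.5]]
p_s = [0.5, 0.5]
f_sa = [0.5, 0.25, 1.0]
grade = plmr_predict(0.125, 0.125, p_s, W, f_sa)
```

Setting `p_s` to a one-hot vector reduces the prediction to a single linear regression model plus the biases, which is one way to see how the membership weights personalize the output.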

  8. Data structure: (a) Homework and quiz; (b) Video; (c) Study session

  9. Feature selection
     - Quiz-related features
     - Time-related features
     - Interval-based features
     - Homework-related features

  10. Feature selection
     - Video-related features
     - Session features

  11. Experimental setup
     - Differing student motivations partition the data into two groups.
     - A different model is applied to each data type.

  12. Experimental protocol
     - PreviousHW-based prediction (diagram over HW1, HW2, HW3, HW4, ...)
     - PreviousOneHW-based prediction (diagram over HW1, HW2, HW3, HW4, ...)
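The diagrams for the two protocols did not survive in this transcript. One plausible reading of the names is that PreviousHW-based prediction trains on every homework before the target, while PreviousOneHW-based prediction trains only on the immediately preceding one. The sketch below encodes that assumption; `training_homeworks` and its mode strings are hypothetical names, not taken from the slides.

```python
def training_homeworks(target, mode):
    """Return the 1-based homework indices used to train a model for `target`.

    Assumed semantics (not confirmed by the slides):
      "previous_hw"     -> all homeworks before the target
      "previous_one_hw" -> only the homework right before the target
    """
    if target < 2:
        raise ValueError("need at least one earlier homework to train on")
    if mode == "previous_hw":
        return list(range(1, target))
    if mode == "previous_one_hw":
        return [target - 1]
    raise ValueError(f"unknown mode: {mode!r}")


# Which homeworks would feed the model that predicts HW4 under each reading.
all_previous = training_homeworks(4, "previous_hw")
one_previous = training_homeworks(4, "previous_one_hw")
```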

  13. Experimental baseline: KT-IDEM
     Model parameters:
     - P(L_0) = initial knowledge
     - P(T) = probability of learning
     - P(G_1…n) = probability of guess per question
     - P(S_1…n) = probability of slip per question
     - n denotes the total number of questions.
     [Diagram: a chain of K nodes starting from P(L_0) and linked by P(T); Q nodes with per-question P(G_1), P(G_2), P(G_3) and P(S_1), P(S_2), P(S_3); I nodes.]
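The parameters listed on this slide fit the standard Bayesian knowledge tracing update, with guess and slip indexed by question. The following is a minimal sketch of one such update step under textbook assumptions, not the baseline code used in the experiments; `kt_idem_step` and all numbers are illustrative, and the probabilities are assumed to lie strictly between 0 and 1.

```python
def kt_idem_step(p_known, correct, p_guess, p_slip, p_learn):
    """One knowledge-tracing step with question-specific guess and slip.

    Returns (probability of a correct answer before observing it,
             updated probability that the skill is known afterwards).
    """
    # Predicted chance of a correct response to this question.
    p_correct = p_known * (1 - p_slip) + (1 - p_known) * p_guess
    # Bayes update of the knowledge estimate given the observed response.
    if correct:
        posterior = p_known * (1 - p_slip) / p_correct
    else:
        posterior = p_known * p_slip / (1 - p_correct)
    # Chance of learning the skill between this question and the next.
    p_next = posterior + (1 - posterior) * p_learn
    return p_correct, p_next


# Illustrative values: P(L_0) = 0.5, one question with guess = slip = 0.25,
# and P(T) = 0.5.
predicted, p_after = kt_idem_step(0.5, True, 0.25, 0.25, 0.5)
```

Running the step once per answered question, with that question's own guess and slip values, yields a sequence of predicted correctness probabilities that can be compared against observed results.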

  14. Comparative Performance
     - Prediction results with a varying number of regression models, for the student group with continuous grade values

  15. Comparative Performance
     - Prediction results with a varying number of regression models, for the student group with binary grade values

  16. Comparative Performance
     - Comparison of accuracy and F1 scores against the baseline approaches
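Accuracy and F1 as named on this slide can be computed from actual and predicted labels, assuming binary labels. The sketch below uses the standard definitions; the function name `accuracy_and_f1` and the toy labels are illustrative and are not data from the experiments.

```python
def accuracy_and_f1(actual, predicted, positive=1):
    """Return (accuracy, F1 for the `positive` class) for two label lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 here when there are no positives.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels: one true positive, one false positive, one false negative.
acc, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1])
```

Reporting both values is useful when the classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the positive class.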

  17. Feature Importance

  18. Feature Importance

  19. Conclusion and future work
     - Prediction algorithm: personalized multiple linear regression model.
     - Experimental results: improved performance compared to baseline methods.
     - Other contribution: analysis of feature importance.
     - Future work: set up an early warning system to help improve students’ performance.

  20. Thank you!
