Predicting Performance on MOOC Assessments using Multi-Regression Models
Zhiyun Ren, Huzefa Rangwala, Aditya Johri
George Mason University, 4400 University Drive, Fairfax, Virginia 22030
Outline
- Background
- Personal Linear Multi-Regression Models
- Feature selection
- Experiments and discussion
- Conclusion and future work
Background
Overview
- Available data: MOOC server logs
- Goal: predict students' performance
Challenges
- Various kinds of participants
- High attrition rate
- Flexible timetable
- Baselines we have tried: linear regression model, mean score
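Personal Linear Multi-Regression Models
Predicted grade:
$$\hat{g}_{s,a} = b_s + b_c + p_s^{T} W f_{sa} = b_s + b_c + \sum_{d=1}^{l} \Big( p_{s,d} \sum_{k=1}^{n_F} f_{sa,k}\, w_{d,k} \Big)$$
- $b_s$, $b_c$: bias terms (the subscript $c$ is an assumption that follows the usual PLMR notation)
- $p_s$: weights of student $s$ over the $l$ regression models
- $W$: $l \times n_F$ coefficient matrix of the regression models
- $f_{sa}$: feature vector
- $n_F$: number of features
- $l$: number of regression models
Objective:
$$\underset{(P,\,W,\,b)}{\text{minimize}} \;\; \frac{1}{2} \sum_{s,a} \left( g_{s,a} - \hat{g}_{s,a} \right)^2 + \text{regularization terms on } P \text{ and } W$$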
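A minimal Python sketch of how a prediction of this form could be computed, assuming the model is a bias plus a student-weighted mix of l linear regression models over n_F features. The function names, the single combined `bias` argument, and the toy numbers are hypothetical, and the regularization terms of the objective are left out.

```python
def plmr_predict(bias, p_s, W, f_sa):
    """Return bias + sum_d p_s[d] * sum_k f_sa[k] * W[d][k].

    bias : sum of the bias terms (assumed to be pre-added by the caller)
    p_s  : student weights over the l regression models (length l)
    W    : l rows of n_F regression coefficients
    f_sa : feature vector of length n_F
    """
    mixed = 0.0
    for d, weight in enumerate(p_s):
        # Output of the d-th linear regression model on this feature vector.
        model_output = sum(f * w for f, w in zip(f_sa, W[d]))
        mixed += weight * model_output
    return bias + mixed


def squared_error_loss(actual, predicted):
    """Half the sum of squared errors; regularization is omitted here."""
    return 0.5 * sum((g - g_hat) ** 2 for g, g_hat in zip(actual, predicted))


# Toy values (hypothetical): l = 2 regression models, n_F = 3 features.
p_s = [0.5, 0.5]
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
f_sa = [1.0, 2.0, 3.0]
g_hat = plmr_predict(bias=0.75, p_s=p_s, W=W, f_sa=f_sa)
print(g_hat)
```

Each row of `W` acts as one regression model, and `p_s` decides how much each model contributes for a given student, which is what makes the prediction personalized.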
Data structure (a) Homework and quiz (b) Video (c) Study session
Feature selection
- Quiz-related features
- Time-related features
- Interval-based features
- Homework-related features
Feature selection
- Video-related features
- Session features
Experimental setup
- Students' differing motivations partition the data into two groups.
- Different models are applied to the different data types.
Experimental protocol
- PreviousHW-based prediction (timeline: HW1, HW2, HW3, HW4, ...)
- PreviousOneHW-based prediction (timeline: HW1, HW2, HW3, HW4, ...)
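The two protocols can be sketched as train/test splits over homework indices. This sketch assumes, from the protocol names only, that PreviousHW-based prediction trains on every homework before the target and that PreviousOneHW-based prediction trains on just the immediately preceding one. The function names are hypothetical.

```python
def previous_hw_split(target):
    """Assumed PreviousHW protocol: train on HW1..HW(target-1), test on HW(target)."""
    if target < 2:
        raise ValueError("need at least one earlier homework to train on")
    return list(range(1, target)), target


def previous_one_hw_split(target):
    """Assumed PreviousOneHW protocol: train on HW(target-1) only, test on HW(target)."""
    if target < 2:
        raise ValueError("need at least one earlier homework to train on")
    return [target - 1], target


# Splits for predicting HW4 under each assumed protocol.
print(previous_hw_split(4))
print(previous_one_hw_split(4))
```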
Experimental baseline: KT-IDEM
Model parameters:
- P(L_0) = initial knowledge
- P(T) = probability of learning
- P(G_1…n) = probability of guess per question
- P(S_1…n) = probability of slip per question
n denotes the number of all questions.
(Diagram: a chain of knowledge nodes K starting from P(L_0) and linked by P(T); each question node Q has its own P(G_i) and P(S_i); item nodes I.)
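A minimal sketch of the knowledge-tracing update behind such a baseline, assuming the standard Bayesian Knowledge Tracing equations with a separate guess and slip probability per question. The function names and numeric values are hypothetical, and parameter fitting is not shown.

```python
def predict_correct(p_know, guess, slip):
    """Probability of a correct answer given the current knowledge estimate."""
    return p_know * (1.0 - slip) + (1.0 - p_know) * guess


def update_knowledge(p_know, correct, guess, slip, p_learn):
    """Bayes update on the observed answer, then apply the learning transition P(T)."""
    p_correct = predict_correct(p_know, guess, slip)
    if correct:
        posterior = p_know * (1.0 - slip) / p_correct
    else:
        posterior = p_know * slip / (1.0 - p_correct)
    return posterior + (1.0 - posterior) * p_learn


def trace(p_init, p_learn, guesses, slips, answers):
    """Knowledge estimate after each question, using per-question guess and slip."""
    p_know = p_init
    history = []
    for guess, slip, correct in zip(guesses, slips, answers):
        p_know = update_knowledge(p_know, correct, guess, slip, p_learn)
        history.append(p_know)
    return history


# Hypothetical parameters for three questions.
history = trace(p_init=0.5, p_learn=0.1,
                guesses=[0.2, 0.3, 0.1], slips=[0.1, 0.1, 0.2],
                answers=[True, False, True])
print(history)
```

Passing one guess and one slip value per question is what distinguishes this from plain knowledge tracing, where a single pair is shared by all questions.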
Comparative Performance
- Prediction results with varying numbers of regression models for the student group with continuous grade values
Comparative Performance
- Prediction results with varying numbers of regression models for the student group with binary grade values
Comparative Performance
- Comparison of accuracy and F1 scores with the baseline approaches
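For reference, accuracy and F1 on binary grades can be computed as in the sketch below. It assumes labels are 0/1 with 1 as the positive class; the function name and the sample labels are hypothetical and are not results from the experiments.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Fall back to 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Hypothetical labels for four students.
print(accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1]))
```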
Feature Importance
Conclusion and future work
- Prediction algorithm: personalized linear multi-regression model.
- Experimental results: improved performance compared to baseline methods.
- Other contribution: analysis of feature importance.
- Future work: set up an early warning system to help improve students' performance.
Thank you!