Temporal Models for Predicting Student Dropout in Massive Open Online Courses

Fei Mi, Dit-Yan Yeung
Hong Kong University of Science and Technology (HKUST)
fmi@ust.hk (fei.mi@epfl.ch)

ICDM ASSESS 2015, November 14, 2015
Outline
1 Background and Motivation
2 Temporal Models
3 Experiments
4 Conclusion
Overview
What can we do?
- Performance evaluation (Peer Grading)
- Help students engage and perform better (Dropout Prediction)
- Build a personalized platform (Recommendation)
Motivation of our work
1 Attrition rates on MOOC platforms are commonly high (60%-80%)
2 Current methods: SVM, Logistic Regression
- Activity features (lecture video, discussion forum)
- Static models
Contribution of our work
1 A sequence labeling perspective
[Figure: weekly activity feature vectors y_1, ..., y_t (Week 1 ... Week t), each paired with a dropout label z_1, ..., z_t]
2 Compare different temporal machine learning models
- Input-output Hidden Markov Model (IOHMM)
- Recurrent Neural Network (RNN)
- RNN with long short-term memory (LSTM) cells
Temporal Models
How to capture temporal information?
Sliding window structures (as in NLP tasks):
1 Features aggregated over a fixed sliding window
2 Temporal span fixed by the window size
Temporal models:
1 Learn from the previous inputs and the current input
2 A temporal pathway allows a "memory" of previous inputs to persist in the internal state
3 Flexible temporal span, learned from the data
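To make the contrast concrete, here is a minimal numpy sketch (not from the slides; array sizes, window width, and weights are illustrative assumptions): the sliding window concatenates a fixed number of past weeks, while a recurrent state can carry information from the whole history.

    import numpy as np

    rng = np.random.default_rng(0)
    weekly_feats = rng.random((6, 7))   # 6 weeks x 7 hypothetical activity features

    # Sliding window: concatenate the last w weeks into one fixed-span input.
    w = 2
    padded = np.vstack([np.zeros((w - 1, 7)), weekly_feats])
    windowed = np.stack([padded[t:t + w].ravel() for t in range(6)])

    # Temporal model: a recurrent state carries a learned "memory" of all
    # previous inputs, so the temporal span is not fixed in advance.
    W_in, W_rec = rng.normal(size=(4, 7)), rng.normal(size=(4, 4))
    h = np.zeros(4)
    for x in weekly_feats:
        h = np.tanh(W_in @ x + W_rec @ h)   # state depends on the whole history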
Input-output Hidden Markov Model (IOHMM)
- Originated from the HMM
- Learns to map input sequences to output sequences

    h_t = A h_{t-1} + B x_t + N(0, Q)
    y_t = C h_t + N(0, R)                  (1)

[Figure: IOHMM structure; input features x_{t-1}, x_t, x_{t+1} drive hidden states h_{t-1}, h_t, h_{t+1}, which emit dropout labels y_{t-1}, y_t, y_{t+1}]
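Eq. (1) as written is a linear-Gaussian state-space update (a full IOHMM additionally conditions discrete state transitions and emissions on the input). A minimal simulation of the update above, with illustrative dimensions and random parameters:

    import numpy as np

    rng = np.random.default_rng(0)
    D_h, D_x, D_y, T = 3, 7, 1, 6   # hidden, input, output dims; number of weeks
    A = rng.normal(size=(D_h, D_h))
    B = rng.normal(size=(D_h, D_x))
    C = rng.normal(size=(D_y, D_h))
    Q, R = 0.1 * np.eye(D_h), 0.1 * np.eye(D_y)

    h = np.zeros(D_h)
    for t in range(T):
        x = rng.random(D_x)   # one week of activity features
        h = A @ h + B @ x + rng.multivariate_normal(np.zeros(D_h), Q)
        y = C @ h + rng.multivariate_normal(np.zeros(D_y), R)   # noisy dropout signal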
Vanilla Recurrent Neural Network (Vanilla RNN)
An RNN allows the network connections to form cycles.

    h_t = H(W_1 x_t + W_2 h_{t-1} + b_h)   (2)
    y_t = F(W_3 h_t + b_y)

[Figure] Left: Vanilla RNN structure; Right: Vanilla RNN unfolded
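A minimal numpy forward pass for Eq. (2), choosing tanh for H and a sigmoid for F (a natural fit for binary dropout labels, though the slides do not fix these choices), with illustrative sizes and random weights:

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(0)
    D_x, D_h = 7, 16
    W1 = rng.normal(size=(D_h, D_x))
    W2 = rng.normal(size=(D_h, D_h))
    W3 = rng.normal(size=(1, D_h))
    b_h, b_y = np.zeros(D_h), np.zeros(1)

    h = np.zeros(D_h)
    for x in rng.random((6, D_x)):           # six weeks of activity features
        h = np.tanh(W1 @ x + W2 @ h + b_h)   # H = tanh
        y = sigmoid(W3 @ h + b_y)            # F = sigmoid -> dropout probability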
Drawbacks of RNN
1 The influence of an input either decays or blows up as it cycles around the recurrent connections
2 Vanishing gradient problem
3 The range of temporal context accessible in practice is usually quite limited
4 The dynamic state of a regular RNN is short-term memory
Long Short-Term Memory Cell (LSTM)
Hochreiter & Schmidhuber (1997) solved the problem of getting an RNN to remember things for a long time.
1 Information gets into the cell whenever the "input" gate is on
2 Information stays in the cell as long as the "forget" gate is closed
3 Information can be read from the cell by turning the "output" gate on
Update Functions of LSTM

    i_t = σ(W_xi x_t + W_hi h_{t-1} + W_ci c_{t-1} + b_i)
    f_t = σ(W_xf x_t + W_hf h_{t-1} + W_cf c_{t-1} + b_f)
    c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(W_xc x_t + W_hc h_{t-1} + b_c)   (3)
    o_t = σ(W_xo x_t + W_ho h_{t-1} + W_co c_{t-1} + b_o)
    h_t = o_t ⊙ tanh(c_t)
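A direct numpy transcription of Eq. (3), one time step per week. The weight shapes are assumptions; the peephole weights W_c* are written here as full matrices to mirror the equations, although implementations often restrict them to diagonals.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(0)
    D_x, D_c = 7, 16
    def mat(m, n):
        return 0.1 * rng.normal(size=(m, n))

    Wxi, Whi, Wci = mat(D_c, D_x), mat(D_c, D_c), mat(D_c, D_c)
    Wxf, Whf, Wcf = mat(D_c, D_x), mat(D_c, D_c), mat(D_c, D_c)
    Wxc, Whc = mat(D_c, D_x), mat(D_c, D_c)
    Wxo, Who, Wco = mat(D_c, D_x), mat(D_c, D_c), mat(D_c, D_c)
    b_i = b_f = b_c = b_o = np.zeros(D_c)

    h = c = np.zeros(D_c)
    for x in rng.random((6, D_x)):                        # six weeks of features
        i = sigmoid(Wxi @ x + Whi @ h + Wci @ c + b_i)    # input gate
        f = sigmoid(Wxf @ x + Whf @ h + Wcf @ c + b_f)    # forget gate
        c = f * c + i * np.tanh(Wxc @ x + Whc @ h + b_c)  # cell state
        o = sigmoid(Wxo @ x + Who @ h + Wco @ c + b_o)    # output gate (sees c_t)
        h = o * np.tanh(c)                                # hidden output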
Hybrid of LSTM Memory Cells and RNN (LSTM Network)
[Figure] Left: Hybrid of LSTM and RNN (LSTM network); Right: LSTM network unfolded
Experiments
Datasets for Dropout Prediction
1 "Science of Gastronomy", a six-week course (Coursera): 85394 → 39877
2 "Introduction to Java Programming", a ten-week course (edX): 46972 → 27629
Dropout Definitions
Three definitions capture different aspects of a student's status in a course:
DEF1 Participation in the final week: whether a student will stay to the end of the course [Yang et al. 2013, Ramesh et al. 2014, He et al. 2015]
DEF2 Last week of engagement: whether the current week is the last week the student has activities [Amnueypornsakul et al. 2014, Kloft et al. 2014, Sinha et al. 2014, Sharkey and Sanders 2014, Taylor et al. 2014]
DEF3 Participation in the next week: whether a student has activities in the coming week

An illustrative example for DEF1 - DEF3:

Time      Week 1             Week 2             Week 3   Week 4   Week 5
Features  [7,34,9,2,0,7,5]   [6,3,12,4,1,8,3]   Zeros    Zeros    Zeros
DEF1      1                  1                  1        1        1
DEF2      0                  0                  1        1        null
DEF3      1                  0                  1        1        null
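A hedged sketch of one literal reading of the three definitions, with 1 = dropout and None where a label is undefined. The label and null conventions here are assumptions; the table above follows the paper's own setup, which this sketch does not attempt to reproduce exactly.

    import numpy as np

    weekly_active = np.array([1, 1, 0, 0, 0], dtype=bool)   # the example above
    T = len(weekly_active)
    last_active = int(np.nonzero(weekly_active)[0].max()) if weekly_active.any() else -1

    # DEF1: same label every week -- does the student skip the final week?
    def1 = [int(not weekly_active[-1])] * T
    # DEF2: is the current week the last week with any activity?
    def2 = [int(t == last_active) for t in range(T)]
    # DEF3: is the student inactive in the coming week? (undefined for the last week)
    def3 = [int(not weekly_active[t + 1]) for t in range(T - 1)] + [None]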