Introduction
Background
Previous education research points to various “persistence factors” such as ability, motivation, time constraints, and self-regulated learning skills.
Learner activity features correlated with these persistence factors can potentially allow us to predict and diagnose dropout.
Predicting and Diagnosing Attrition in MOOCs
Sherif Halawa, Daniel Greene, and Pr. John Mitchell
Attrition rates in MOOCs usually exceed 80%, presenting interesting opportunities to study persistence and experiment with educational interventions. How can we predict which learners are at risk of dropout and predict their reasons for dropout? These questions are a key part of intervention design in MOOCs.
Data from multiple MOOCs was used to build training and test sets. Dropout labels were assigned using learner activity data, and labels for reasons of dropout were derived from a diagnostic survey on persistence factors.
Features were extracted from learner data, and models were constructed for dropout and each of its modeled reasons: lack of ability, lack of motivation, and lack of time.
Method
Dropout Prediction Results
Dropout Diagnosis Results
Given the outputs of the dropout prediction and diagnosis models, how can we design interventions for amenable learners? To what extent can this help increase persistence in MOOCs?
Implications and Future Work
References
S. Halawa, D. Greene, and J. Mitchell, “Dropout prediction in MOOCs using learner activity features,” Proceedings of EMOOCs 2014, Feb 10-12, 2014, Switzerland.
Predicting Dropout
Diagnosing Dropout
Our study involved 20 MOOCs from different fields (computer science, political science, agriculture, etc.).
Extracted features from learners' activity on videos, assessments, and forums.
We want to obtain a model that generalizes to many courses (sacrificing some prediction accuracy for generalizability).
Thus, we used forward feature selection to choose the features with the best median prediction accuracy, measured by recall and false positive rate (FPR), over all courses in our dataset.
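The selection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the poster does not say how recall and FPR are combined into one criterion or how the selected features form a prediction, so the sketch assumes a score of recall minus FPR and an "at risk if any selected flag fires" rule. The feature names and toy courses are hypothetical.

```python
from statistics import median

def recall_and_fpr(y_true, y_pred):
    """Return (recall, false positive rate) for binary labels (1 = dropout)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, fpr

def course_score(course, features):
    """ASSUMPTION: score = recall - FPR of an 'any selected flag fires' rule."""
    y_pred = [int(any(row[f] for f in features)) for row in course["rows"]]
    recall, fpr = recall_and_fpr(course["labels"], y_pred)
    return recall - fpr

def forward_select(courses, candidates):
    """Greedily add the feature that most improves the median score over courses."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        scored = [
            (median(course_score(c, selected + [f]) for c in courses), f)
            for f in remaining
        ]
        top_score, top_feature = max(scored)
        if top_score <= best:  # stop when no candidate improves the median
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Hypothetical toy data: two courses, three binary features per learner.
courses = [
    {"rows": [{"low_score": 1, "skipped_video": 0, "noise": 1},
              {"low_score": 0, "skipped_video": 1, "noise": 0},
              {"low_score": 0, "skipped_video": 0, "noise": 1},
              {"low_score": 0, "skipped_video": 0, "noise": 0}],
     "labels": [1, 1, 0, 0]},
    {"rows": [{"low_score": 1, "skipped_video": 1, "noise": 0},
              {"low_score": 1, "skipped_video": 0, "noise": 0},
              {"low_score": 0, "skipped_video": 0, "noise": 1},
              {"low_score": 0, "skipped_video": 0, "noise": 0}],
     "labels": [1, 1, 0, 0]},
]
selected, best = forward_select(courses, ["low_score", "skipped_video", "noise"])
```

Scoring by the median across courses, rather than on pooled data, keeps a single large course from dominating the choice, which matches the stated goal of a model that generalizes to many courses.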
Resulting model:
Active mode features: yield a predictor that works while the student is still active
Absent mode feature: adds an allowance after the student stops engaging with the course
Avg score (2 or more assns) < 50%?
Lagging by > 2 weeks during 1st month?
Total absence > 14 days?
Skipped any videos?
Skipped any assessments?
Figure panels: active mode predictor (using active mode only); integrated predictor (using active mode + absent mode)
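The five threshold questions listed above can be read as a simple rule-based predictor. The sketch below is one possible reading, under assumptions the poster does not confirm: total absence is treated as the absent mode feature and the other four as active mode features, and a learner is flagged as at risk if any flag fires. The `LearnerActivity` fields are hypothetical names for the underlying measurements.

```python
from dataclasses import dataclass

@dataclass
class LearnerActivity:
    """Hypothetical per-learner summary; field names are illustrative only."""
    avg_score: float                  # mean assessment score, 0.0 to 1.0
    num_scored: int                   # number of scored assessments
    max_lag_weeks_first_month: float  # largest lag behind schedule in month 1
    skipped_videos: int
    skipped_assessments: int
    total_absence_days: int

def active_mode_flags(a):
    """Flags that can be evaluated while the learner is still active."""
    return {
        "low_avg_score": a.num_scored >= 2 and a.avg_score < 0.50,
        "lagging": a.max_lag_weeks_first_month > 2,
        "skipped_video": a.skipped_videos > 0,
        "skipped_assessment": a.skipped_assessments > 0,
    }

def absent_mode_flag(a):
    """Flag based on how long the learner has been away from the course."""
    return a.total_absence_days > 14

def predict_at_risk(a, use_absent_mode=True):
    """ASSUMPTION: flags are combined with a logical OR."""
    at_risk = any(active_mode_flags(a).values())
    if use_absent_mode:
        at_risk = at_risk or absent_mode_flag(a)
    return at_risk

# A learner with no active mode flags who has been absent for 20 days.
learner = LearnerActivity(0.85, 3, 0.5, 0, 0, 20)
active_only = predict_at_risk(learner, use_absent_mode=False)
integrated = predict_at_risk(learner, use_absent_mode=True)
```

In this reading, the active mode predictor misses a learner who performed well and then disappeared, while the integrated predictor catches that learner once the absence exceeds 14 days.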
Surveyed ~9,000 students on their level of motivation, time allowance, and difficulties experienced (~800 responses). Used the survey responses to attach labels to learners:
Lack of motivation? Yes / No
Lack of time? Yes / No
Difficulty? Yes / No
Extracted features from learners' engagement with videos, assessments, and forums.
Built logistic regression models for predicting the labels assigned to each student.
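As an illustration of this step, the sketch below fits a logistic regression for a single label with plain batch gradient descent. It is not the authors' code: the two features (fraction of videos viewed, fraction of assessments completed), the toy data, and the learning rate and epoch count are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that the label is 'Yes' for feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: [fraction of videos viewed, fraction of assessments completed]
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3], [0.7, 0.6], [0.3, 0.2]]
y = [0, 0, 1, 1, 0, 1]  # 1 = survey label "lack of motivation: Yes"

w, b = train_logistic(X, y)
p_disengaged = predict_proba(w, b, [0.15, 0.2])
p_engaged = predict_proba(w, b, [0.85, 0.85])
```

One such model would be trained per label (lack of motivation, lack of time, difficulty), each using the survey-derived Yes / No labels as targets.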
Figure: learner activity over Week 1, Week 2, and Week 3, including forum and study group activity
Examples of features:
Videos viewed / skipped
Assessments started
Assessments completed
Assessment grades
Forming / joining study groups
Sticking to course schedule
Posting questions/answers to the forum
Prediction accuracy for lack of motivation
Prediction accuracy for lack of time