Interpretive Analyses of Learner Dropout Prediction in Online STEM Courses
Junling Lu, Jie Mou, Peng Li
DOI: https://doi.org/10.1109/cste59648.2023.00020
2023-01-01
Abstract: Contribution: This study uses SHapley Additive exPlanations (SHAP) to explore how online behaviors and background information explain dropout, both for learners overall and for an individual learner, in online Science, Technology, Engineering and Mathematics (STEM) courses. The findings provide general guidance for reducing the dropout rate and for improving the performance of specific learners in personalized learning. Background: The widespread use of machine learning in education lacks intuitive explanations of the prediction outcomes of black-box models. It is vital to analyze visually, in a human-understandable way, the relations between influencing factors such as behaviors and background information and the predicted outcomes of learner dropout; doing so helps enhance the trustworthiness of prediction models and supports corresponding measures to reduce learner dropout in online STEM courses. Research Questions: 1) How do learners' behaviors and background information individually and interactively influence dropout from online STEM courses as a whole? 2) How can intuitive and understandable explanations of dropout be provided for an individual learner in online STEM courses? Methodology: Data on STEM courses, comprising six behavior features and five background information features, are extracted from the Open University Learning Analytics Dataset (OULAD). Focusing on the impact of behaviors and background information on dropout prediction for learners overall and for an individual learner, a SHAP-based interpretive study is conducted on the best model selected from eXtreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), Support Vector Machine (SVM) and Logit Regression (LR).
Findings: For dropout from STEM courses, the features ranked in decreasing order of importance are oucontent, subpage, resource, ouwiki, education, forumng, age, url, credits, gender and attempts, and the magnitude of a feature's value corresponds to a negative or positive influence on the dropout risk. In addition, learners are hierarchically clustered into groups by similarity of explanation and feature importance, and interactions exist to some extent between some of these factors. Finally, the degree to which negative and positive factors affect dropout varies across individual learners.
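Illustration: The per-learner attribution idea behind SHAP can be sketched with an exact Shapley value computation on a toy model. This is a minimal sketch under stated assumptions, not the study's pipeline: the function `toy_risk`, its coefficients, the three chosen features and the `learner` and `baseline` values are all hypothetical, and a single baseline learner stands in for the background dataset that SHAP would normally average over.

```python
from itertools import combinations
from math import factorial

# Three feature names borrowed from the abstract; the subset is an assumption.
FEATURES = ["oucontent", "forumng", "attempts"]


def toy_risk(x):
    """Hypothetical dropout-risk score (illustrative only, not the paper's model)."""
    return (0.9
            - 0.004 * x["oucontent"]
            - 0.002 * x["forumng"]
            + 0.05 * x["attempts"]
            - 0.00001 * x["oucontent"] * x["forumng"])  # small interaction term


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A coalition S uses the learner's values for features in S and the
    baseline's values for all other features.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = {g: (x[g] if g in subset else baseline[g])
                             for g in FEATURES}
                with_f = dict(without_f)
                with_f[f] = x[f]
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# Hypothetical individual learner and reference learner (made-up values).
learner = {"oucontent": 20, "forumng": 5, "attempts": 2}
baseline = {"oucontent": 100, "forumng": 40, "attempts": 0}

phi = shapley_values(toy_risk, learner, baseline)

# Rank features by absolute contribution, as a SHAP importance plot would.
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the gap between the learner's score and the baseline's score, which is the additivity property that makes an individual explanation readable as pushes toward or away from dropout. Exact enumeration is only feasible for a handful of features; for tree models such as XGBoost, the SHAP library provides faster algorithms.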