How Widely Can Prediction Models be Generalized? Performance Prediction in Blended Courses

Niki Gitinabard, Yiqiao Xu, Sarah Heckman, Tiffany Barnes, Collin F. Lynch
DOI: https://doi.org/10.1109/TLT.2019.2911832
2019-06-22
Abstract: Blended courses that mix in-person instruction with online platforms are increasingly popular in secondary education. These tools record a rich amount of data on students' study habits and social interactions. Prior research has shown that these metrics are correlated with students' performance in face-to-face classes. However, predictive models for blended courses are still limited and have not yet succeeded at early prediction or cross-class predictions even for repeated offerings of the same course. In this work, we use data from two offerings of two different undergraduate courses to train and evaluate predictive models on student performance based upon persistent student characteristics including study habits and social interactions. We analyze the performance of these models on the same offering, on different offerings of the same course, and across courses to see how well they generalize. We also evaluate the models on different segments of the courses to determine how early reliable predictions can be made. This work tells us in part how much data is required to make robust predictions and how cross-class data may be used, or not, to boost model performance. The results of this study will help us better understand how similar the study habits, social activities, and the teamwork styles are across semesters for students in each performance category. These trained models also provide an avenue to improve our existing support platforms to better support struggling students early in the semester with the goal of providing timely intervention.
Computers and Society, Machine Learning
What problem does this paper attempt to address?
This paper examines how well models that predict student performance in blended courses generalize. Specifically, the researchers ask whether a model built on students' study habits and social interactions can predict academic performance in a different course, or in a different offering of the same course. The study addresses six questions:

1. **How do different social graph generation methods affect the performance of prediction models based on these graphs?** - Explores how the way a social network is constructed affects the models built on it.
2. **Which characteristics of students' learning habits and social connections can best predict students' performance?** - Analyzes students' online behavior data to identify which behavior patterns are associated with high grades.
3. **Using the data of the same course, how early after the start of the course can we accurately predict students' performance?** - Examines whether data from early in the course supports predictions reliable enough for timely intervention.
4. **Can the prediction model generated during one offering of a course be transferred to another offering of the same course?** - Tests how well a model generalizes across offerings of the same course.
5. **Can the prediction model generated on one course be transferred to another course?** - Tests how well a model generalizes across different courses.
6. **How do these models perform in identifying at-risk students?** - Evaluates how effectively the models identify, early on, the students who need additional support.

By answering these questions, the authors aim to understand how far prediction models based on persistent student characteristics (such as study habits and social interactions) generalize across courses and across offerings, and how useful they are for early prediction, so that struggling students can be identified and supported sooner.
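The cross-offering question (training on one offering and testing on another) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy records, the two feature names, and the nearest-centroid classifier are hypothetical stand-ins, not the paper's data, features, or models.

```python
from statistics import mean

# Hypothetical toy records, invented for illustration only: (features, label).
# Features stand in for something like (normalized activity, normalized forum
# degree); label 1 marks an at-risk student, label 0 a student on track.
offering_a = [((0.9, 0.8), 0), ((0.8, 0.7), 0), ((0.2, 0.1), 1), ((0.3, 0.2), 1)]
offering_b = [((0.85, 0.9), 0), ((0.1, 0.3), 1), ((0.7, 0.6), 0), ((0.25, 0.15), 1)]


def fit_centroids(rows):
    """Return the mean feature vector of each class label."""
    centroids = {}
    for label in {y for _, y in rows}:
        members = [x for x, y in rows if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*members))
    return centroids


def predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in rows)


# Train on one offering, then score on the same and on a different offering.
model = fit_centroids(offering_a)
within_offering = accuracy(model, offering_a)
cross_offering = accuracy(model, offering_b)
print(f"within: {within_offering:.2f}, cross: {cross_offering:.2f}")
```

Comparing the within-offering score with the cross-offering score is the basic shape of a generalization check; a large gap between the two would suggest the model does not transfer. The same pattern extends to the early-prediction question by building the features from only the first weeks of each offering.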