Accurate Multi-Category Student Performance Forecasting at Early Stages of Online Education Using Neural Networks

Naveed Ur Rehman Junejo, Muhammad Wasim Nawaz, Qingsheng Huang, Xiaoqing Dong, Chang Wang, Gengzhong Zheng
2024-12-08
Abstract: The ability to accurately predict and analyze student performance in online education, both at the outset and throughout the semester, is vital. Most published studies focus on binary classification (Fail or Pass), but there is still a significant research gap in predicting students' performance across multiple categories. This study introduces a novel neural network-based approach capable of accurately predicting student performance and identifying vulnerable students at early stages of online courses. The Open University Learning Analytics (OULA) dataset is employed to develop and test the proposed model, which predicts outcomes in the Distinction, Fail, Pass, and Withdrawn categories. The OULA dataset is preprocessed to extract features from demographic data, assessment data, and clickstream interactions within a Virtual Learning Environment (VLE). Comparative simulations indicate that the proposed model significantly outperforms existing baseline models, including Artificial Neural Network Long Short-Term Memory (ANN-LSTM), Random Forest (RF) 'gini', RF 'entropy', and Deep Feed Forward Neural Network (DFFNN), in terms of accuracy, precision, recall, and F1-score. The results indicate that the prediction accuracy of the proposed method is about 25% higher than that of the existing state of the art. Furthermore, compared to existing methodologies, the model demonstrates superior predictive capability across temporal course progression, achieving higher accuracy even when only the initial 20% of the course has been completed.
Machine Learning, Computers and Society
What problem does this paper attempt to address?
The paper addresses the early and accurate prediction of students' multi-category academic performance in online education. Specifically, it aims to develop a new neural-network-based method that can accurately predict students' academic outcomes in the early stages of a course and identify students at risk of dropping out.

### Problem Background

High dropout rates are a pressing problem in online education, and accurate prediction of student performance is crucial for reducing them. Although many studies have focused on binary classification (such as pass or fail), relatively few have addressed multi-category classification (such as Distinction, Fail, Pass, and Withdrawn). In addition, existing models have low prediction accuracy in the early stages of a course, which makes timely and effective intervention difficult.

### Paper Objectives

The main objectives of the paper are:

1. **Develop a new neural-network model**: The model handles multi-category classification and predicts students' academic performance.
2. **Improve the accuracy of early prediction**: In particular, the model should accurately predict students' academic outcomes within the first 20% of the course.
3. **Identify at-risk students**: Help teachers intervene in time to reduce the dropout rate.

### Main Contributions

The main contributions of the paper include:

- Proposing a new data preprocessing pipeline that integrates the multiple files of the dataset and retains key features.
- Introducing feature-engineering techniques that extract useful information from those files and improve the accuracy of early student performance prediction.
- Using a one-dimensional convolutional neural network (1D-CNN) model for multi-category classification, achieving excellent accuracy, precision, recall, and F1-score.
- Showing experimentally that the proposed model is approximately 25% more accurate than existing baseline models, especially in the early stages of the course.

### Method Overview

To achieve these goals, the paper uses the Open University Learning Analytics (OULA) dataset and performs the following steps:

1. **Data preprocessing**: Data aggregation, handling of missing values, feature engineering, data merging, encoding, feature scaling, and correlation analysis.
2. **Construct and train the 1D-CNN model**: Design a neural-network model that includes convolutional layers, fully connected layers, batch normalization, and dropout.
3. **Evaluate model performance**: Measure accuracy, precision, recall, and F1-score, and compare the results with other baseline models.

Through these steps, the paper addresses the challenges of multi-category academic performance prediction in online education and supports efforts to improve its quality.
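
The evaluation step can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function name `macro_metrics`, the toy labels, the choice of macro averaging (an unweighted mean over the four categories), and the convention of scoring an undefined ratio as 0.0 are all assumptions made for illustration, since the summary does not say how the per-class scores are averaged.

```python
from collections import Counter

# The four outcome categories named in the abstract.
CLASSES = ["Distinction", "Fail", "Pass", "Withdrawn"]


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption of this sketch; a class with no
    predicted (or no true) samples contributes 0.0 to the average.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    precisions, recalls, f1s = [], [], []
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical labels for eight students, made up for this example.
    y_true = ["Pass", "Pass", "Fail", "Withdrawn",
              "Distinction", "Withdrawn", "Fail", "Pass"]
    y_pred = ["Pass", "Fail", "Fail", "Withdrawn",
              "Pass", "Withdrawn", "Fail", "Pass"]
    for name, value in macro_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Macro averaging weights each category equally, so a rare class such as Distinction influences the score as much as a common one. With imbalanced outcome categories, that can make macro scores differ noticeably from plain accuracy.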