Enhancing the Performance of Automated Grade Prediction in MOOC using Graph Representation Learning

Soheila Farokhi, Aswani Yaramala, Jiangtao Huang, Muhammad F. A. Khan, Xiaojun Qi, Hamid Karimi
2023-10-19
Abstract: In recent years, Massive Open Online Courses (MOOCs) have gained significant traction as a rapidly growing phenomenon in online learning. Unlike traditional classrooms, MOOCs offer a unique opportunity to cater to a diverse audience from different backgrounds and geographical locations. Renowned universities and MOOC-specific providers, such as Coursera, offer MOOC courses on various subjects. Automated assessment tasks like grade and early dropout predictions are necessary due to the high enrollment and limited direct interaction between teachers and learners. However, current automated assessment approaches overlook the structural links between different entities involved in the downstream tasks, such as the students and courses. Our hypothesis suggests that these structural relationships, manifested through an interaction graph, contain valuable information that can enhance the performance of the task at hand. To validate this, we construct a unique knowledge graph for a large MOOC dataset, which will be publicly available to the research community. Furthermore, we utilize graph embedding techniques to extract latent structural information encoded in the interactions between entities in the dataset. These techniques do not require ground truth labels and can be utilized for various tasks. Finally, by combining entity-specific features, behavioral features, and extracted structural features, we enhance the performance of predictive machine learning models in student assignment grade prediction. Our experiments demonstrate that structural features can significantly improve the predictive performance of downstream assessment tasks. The code and data are available at https://github.com/DSAatUSU/MOOPer_grade_prediction
Machine Learning, Artificial Intelligence
### What problem does this paper attempt to address?

This paper addresses two key problems in automated grade prediction for Massive Open Online Courses (MOOCs):

1. **Prediction granularity**: Existing grade prediction methods usually focus on large-scale assignments or the entire course, ignoring the small, skill-specific exercises (called "challenges") that occur frequently in MOOC systems. These small exercises are equally important because they give a more detailed view of students' learning progress and mastery of specific skills.
2. **Ignored structural relationships**: Most existing methods fail to account for the rich structural relationships between students and courses. These relationships are represented by an interaction graph that captures the connections between students and challenges. The authors hypothesize that this structural information can improve prediction performance.

To address these problems, the authors propose the following:

- **Constructing a new dataset**: A new dataset from a large-scale MOOC provider in China is introduced, containing interaction data of thousands of students with various entities (such as challenges, courses, and chapters).
- **Constructing an interaction graph**: The relationship between students and challenges is modeled as a bipartite graph, and graph representation learning techniques (such as node2vec and DeepWalk) are used to extract dense vector representations at the entity level.
- **Combining multiple features**: Entity-specific features, behavioral features, and the extracted structural features are fused to improve the performance of machine learning models in predicting students' challenge grades.
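
The graph construction and feature fusion steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which lives in their repository): the interaction records, the `student:`/`challenge:` node prefixes, and the function names are all hypothetical. It generates uniform random walks in the style of DeepWalk; node2vec would instead bias each step with its return and in-out parameters. Training a skip-gram model on the walks to obtain the actual embeddings is omitted.

```python
import random
from collections import defaultdict

# Hypothetical (student, challenge) interaction records; the IDs are made up.
interactions = [
    ("s1", "c1"), ("s1", "c2"),
    ("s2", "c2"), ("s2", "c3"),
    ("s3", "c1"), ("s3", "c3"),
]

def build_bipartite_graph(pairs):
    """Build undirected adjacency lists; prefixes keep the two node sets disjoint."""
    adj = defaultdict(list)
    for student, challenge in pairs:
        s, c = f"student:{student}", f"challenge:{challenge}"
        adj[s].append(c)
        adj[c].append(s)
    return dict(adj)

def random_walks(adj, walks_per_node=2, walk_length=5, seed=0):
    """Generate uniform random walks starting from every node (DeepWalk-style)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                # Every node has at least one neighbor because the graph
                # is built from edges only.
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

def fuse_features(entity_feats, behavior_feats, student_vec, challenge_vec):
    """Concatenate all feature groups into one input row for a downstream model."""
    return list(entity_feats) + list(behavior_feats) + list(student_vec) + list(challenge_vec)

graph = build_bipartite_graph(interactions)
walks = random_walks(graph)
```

Because the graph is bipartite, every walk alternates between student and challenge nodes. In a full pipeline, the walks would serve as "sentences" for a skip-gram model, and the learned student and challenge vectors would be passed to something like `fuse_features` together with the entity-specific and behavioral features before training a grade predictor.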
Through the above methods, the authors show that structural information can significantly improve the prediction performance of downstream assessment tasks (such as challenge grade prediction).

### Key contributions

- A new MOOC dataset is introduced, covering various entities and their interactions; it will be publicly available to the research community.
- The work focuses on grade prediction for short, small-scale, skill-specific exercises (challenges), rather than traditional course-level prediction.
- A bipartite graph between students and challenges is constructed, and unsupervised graph embedding methods are used to extract dense entity-level vector representations.
- The effectiveness of the interaction graph and its resulting representations in challenge grade prediction is verified through extensive experiments.

### Conclusion

This research shows that using graph representation learning to extract structural information from the interactions between students and challenges can significantly improve the performance of automated grade prediction. The finding has practical value for MOOC platforms and suggests a direction for future research.