Abstract: As online courses become the norm in the higher-education landscape, investigations into student performance between students who take online vs on-campus versions of classes become necessary. While attention has been given to looking at differences in learning outcomes through comparisons of students' end performance, less attention has been given in comparing students' engagement patterns between different modalities. In this study, we analyze a heterogeneous knowledge graph consisting of students, course videos, formative assessments and their interactions to predict student performance via a Graph Convolutional Network (GCN). Using students' performance on the assessments, we attempt to determine a useful model for identifying at-risk students. We then compare the models generated between 5 on-campus and 2 fully-online MOOC-style instances of the same course. The model developed achieved a 70-90% accuracy of predicting whether a student would pass a particular problem set based on content consumed, course instance, and modality.

What problem does this paper attempt to address?

This paper addresses how predictions of student performance differ between online courses and traditional on-campus courses. Specifically, the researchers analyze a Heterogeneous Knowledge Graph (HKG) containing students, course videos, formative assessments, and their interactions, and use a Graph Convolutional Network (GCN) to predict students' academic performance and identify students who may be at risk. The study focuses on comparing student participation patterns and performance prediction models for the same course across modalities (online and on-campus).

### Research Background

As online education has become the norm in higher education, it is necessary to study how student performance differs between online courses and traditional on-campus courses. Previous studies have focused on differences in students' final grades, but relatively few have examined students' participation patterns across learning modalities. This paper constructs a heterogeneous knowledge graph containing students, course videos, formative assessments, and their interactions, and uses a Graph Convolutional Network (GCN) to predict students' performance and to identify high-risk students who may need early intervention.

### Research Methods

1. **Data Sources**: The research data comes from the GTX1301 "Introduction to Python" course at Georgia Tech, which is offered in both online and on-campus versions. The data set includes click-stream data, student feature matrices, video/assessment feature matrices, course content edge matrices, student-content edge matrices, and student-page edge matrices.
2. **Model Construction**: The researchers used PyTorch Geometric (PyG) to construct a Graph Convolutional Network (GCN) containing two SAGEConv (GraphSAGE convolution) layers and a ReLU activation function. After the input node data is processed by standard embedding and linear transformation, the dot product between user and page nodes is calculated to predict whether a student will pass a particular page.
3. **Model Training**: The GCN model is trained with a hidden size of 64 and an output size of 4. The data is divided into an 80% training set, a 10% validation set, and a 10% test set. Training uses PyTorch's binary cross-entropy loss function and the Adam optimizer.

### Research Results

- **Prediction Accuracy**: The GCN model achieved AUC scores of 58% to 90% in predicting whether students will pass a particular set of questions.
- **Modality Differences**: There are significant differences between the prediction models for online courses and on-campus courses. The AUC score of the on-campus course in 2021 reached 90%, while the AUC score in 2022 was 82%. The lower AUC score of the on-campus course in 2022 may be due to the lack of data in the fall semester of 2022.
- **Repeatability and Transferability**: When training on a single course instance, the AUC score of the GCN model fluctuates greatly. This may be due to a floor effect when dividing the data, especially in on-campus courses with a small number of users. In addition, differences in data shapes between course instances lead to poor transferability of the model between courses.

### Conclusions and Limitations

- **Conclusions**: The study extends the previous click-stream GCN model by adding student-assessment interactions, but focuses mainly on one course. Future research needs to examine different courses, topics, and degree paths.
- **Limitations**: There is limited understanding of the demographic characteristics of students participating in these courses. Future plans include developing more powerful graph network implementations that include more demographic information. In addition, a scalable method needs to be developed to infer the relationships between different course contents to improve the transferability of the model.

### Formula Display

- **Dot Product Calculation**: \[ \text{Edge Prediction} = \text{dot}(u, p) \] where \(u\) is the user node and \(p\) is the page node.
- **Loss Function**: \[ \text{Loss} = \text{BCEWithLogitsLoss}(\hat{y}, y) \] where \(\hat{y}\) is the model's predicted value and \(y\) is the true label.
- **AUC Calculation**
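
The dot-product edge prediction, the binary cross-entropy loss on logits, the AUC metric, and the 80/10/10 split described in this summary can be illustrated with a small, dependency-free Python sketch. It is not the authors' PyTorch Geometric implementation: the embeddings, edge labels, and function names (`dot`, `bce_with_logits`, `auc`, `split_80_10_10`) are hypothetical, and AUC is computed here with the standard pairwise-ranking definition, which is an assumption rather than something taken from the paper.

```python
import math
import random


def dot(u, p):
    """Dot product of a user embedding and a page embedding (the edge logit)."""
    return sum(a * b for a, b in zip(u, p))


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy on raw logits, in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, labels):
        # Equals -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))] without overflow.
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0
        for sp in pos
        for sn in neg
    )
    return wins / (len(pos) * len(neg))


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into 80% train, 10% validation, 10% test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


# Hypothetical 4-dimensional embeddings for two students and two pages.
users = {"u1": [0.5, 1.0, -0.5, 0.0], "u2": [-1.0, 0.0, 0.5, 0.5]}
pages = {"p1": [1.0, 1.0, 0.0, 0.0], "p2": [0.0, -1.0, 1.0, 0.0]}

# Hypothetical (student, page, passed) edges; 1 means the student passed the page.
edges = [("u1", "p1", 1), ("u1", "p2", 0), ("u2", "p1", 0), ("u2", "p2", 1)]

logits = [dot(users[u], pages[p]) for u, p, _ in edges]
labels = [y for _, _, y in edges]
loss = bce_with_logits(logits, labels)
score = auc(logits, labels)
train, val, test = split_80_10_10(range(100))
```

With these toy values every passing edge receives a higher logit than every failing edge, so the AUC is 1.0; real course data would not separate this cleanly.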

A Comparative Analysis of Student Performance Predictions in Online Courses using Heterogeneous Knowledge Graphs
