Revealing the Hidden Patterns: A Comparative Study on Profiling Subpopulations of MOOC Students

Lei Shi, Alexandra I. Cristea, Armando M. Toda, Wilk Oliveira
DOI: https://doi.org/10.48550/arXiv.2008.05850
2020-08-12
Abstract: Massive Open Online Courses (MOOCs) exhibit a remarkable heterogeneity of students. The advent of complex "big data" from MOOC platforms is a challenging yet rewarding opportunity to deeply understand how students are engaged in MOOCs. Past research, looking mainly into overall behavior, may have missed patterns related to student diversity. Using a large dataset from a MOOC offered by FutureLearn, we delve into a new way of investigating hidden patterns through both machine learning and statistical modelling. In this paper, we report on clustering analysis of student activities and comparative analysis on both behavioral patterns and demographical patterns between student subpopulations in the MOOC. Our approach allows for a deeper understanding of how MOOC students behave and achieve. Our findings may be used to design adaptive strategies towards an enhanced MOOC experience.
Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: How to reveal the hidden patterns of student behavior and demographic characteristics in Massive Open Online Courses (MOOCs) through machine learning and statistical modeling. Specifically, the researchers hope to find different subgroups of students and analyze the behavioral and demographic patterns within these subgroups.

### Specific description of the problem

1. **How to find different subgroups of MOOC students?**
   - The researchers used the clustering analysis method to identify subgroups of students with different behavior patterns. They selected three key variables: number of visits, number of attempts, and number of comments, and carried out clustering analysis through the k-means algorithm.
2. **Are there behavioral and demographic patterns within these subgroups?**
   - The researchers further compared the differences in behavioral and demographic characteristics among different subgroups. They analyzed the completion rate, correct answer rate, and response rate of each subgroup, and explored the differences in gender and age distribution.

### Research background

MOOCs (Massive Open Online Courses) have attracted millions of students around the world, but the diversity of student backgrounds and behaviors makes it complex to understand and optimize these courses. Past research has mainly focused on overall behavior and may have overlooked the potential patterns of student diversity. Therefore, this study aims to reveal hidden patterns by in-depth analysis of student behavior and demographic characteristics, thereby providing a basis for designing a more adaptable MOOC experience.

### Overview of methods

- **Data source**: The study used the data of the MOOC "Shakespeare and his World" on the FutureLearn platform.
- **Variable selection**: Based on correlation and distribution characteristics, three relatively independent key variables representing student engagement were selected: number of visits, number of attempts, and number of comments.
- **Clustering analysis**: Clustering analysis was carried out on 13,971 active students using the k-means algorithm, and 7 different subgroups of students were determined.
- **Comparative analysis**: The differences in behavioral and demographic characteristics among different subgroups were compared through non-parametric tests (such as Kruskal-Wallis H test and Mann-Whitney U test).

### Main findings

- **Clustering results**: 7 different subgroups of students were found, and each subgroup had significant differences in the number of visits, number of attempts, and number of comments.
- **Behavior patterns**: Some subgroups (such as Cluster 1) were active in attempting to answer questions, but had lower correct answer rates and completion rates; other subgroups (such as Cluster 2) showed a high level of social interaction, but were not necessarily the best in academic performance.
- **Demographic patterns**: There were also significant differences in gender and age distribution among different subgroups. For example, Cluster 2 was mainly composed of young or old women, while Cluster 4 contained more elderly students.

### Conclusion

Through clustering analysis and comparative analysis, the study has revealed the hidden behavioral and demographic patterns among MOOC students. These findings are helpful for understanding the learning behaviors of different types of students and provide an important reference for designing a more personalized and adaptable MOOC experience.
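
To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) over the three engagement variables named in the summary: visits, attempts, and comments. It is an illustration under stated assumptions, not the authors' code. The nine student rows are invented toy data, `k` is set to 3 rather than the paper's 7, and the function names (`sq_dist`, `farthest_first`, `kmeans`) are hypothetical. The starting centroids come from a farthest-first heuristic chosen here only to make the run reproducible; the summary does not say how the original analysis initialised or scaled its data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the
    first row, then repeatedly add the row farthest from all chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Invented toy rows: (visits, attempts, comments) per student.
students = [
    (2, 0, 0), (3, 1, 0), (1, 0, 1),        # low engagement
    (40, 30, 1), (42, 28, 0), (38, 33, 2),  # many quiz attempts
    (35, 5, 25), (37, 4, 30), (33, 6, 27),  # many comments
]

labels, centroids = kmeans(students, k=3)
print(labels)  # labels -> [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

On real data with very different ranges per variable, standardising each column before clustering is a common precaution, and a library implementation such as scikit-learn's `KMeans` would normally replace this hand-written loop.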
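
The comparative analysis relies on rank-based tests such as the Kruskal-Wallis H test. As a worked illustration, the sketch below computes the H statistic by hand using the standard formula H = 12 / (N(N+1)) · Σ(R_i² / n_i) − 3(N+1), divided by the usual tie correction. The three groups of completion rates are invented toy values, not figures from the paper, and the helper names (`average_ranks`, `kruskal_wallis_h`) are hypothetical.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples.
    Undefined (division by zero) if every pooled value is identical."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over tied-value counts t.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


# Invented completion rates for three hypothetical clusters.
cluster_a = [0.10, 0.15, 0.20]
cluster_b = [0.50, 0.55, 0.60]
cluster_c = [0.90, 0.95, 0.85]

h = kruskal_wallis_h(cluster_a, cluster_b, cluster_c)
print(round(h, 3))  # approximately 7.2 for this toy data
```

In practice, `scipy.stats.kruskal` returns this statistic together with a p-value, and `scipy.stats.mannwhitneyu` covers the pairwise follow-up comparisons; the hand-written version is only meant to show what the test measures.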