MOOCCubeX: A Large Knowledge-centered Repository for Adaptive Learning in MOOCs
Jifan Yu, Yuquan Wang, Qingyang Zhong, Gan Luo, Yiming Mao, Kai Sun, Wenzheng Feng, Wei Xu, Shulin Cao, Kaisheng Zeng, Zijun Yao, Lei Hou, Yankai Lin, Peng Li, Jie Zhou, Bin Xu, Juanzi Li, Jie Tang, Maosong Sun
DOI: https://doi.org/10.1145/3459637.3482010
2021-01-01
Abstract: The prosperity of massive open online courses (MOOCs) has fueled plentiful research on adaptive learning. However, current open-access educational datasets are still far from sufficient to support the range of adaptive learning topics. Existing datasets are often small in scale and lack fine-grained knowledge concepts, and platform limitations make them difficult to curate and supplement. In this work, we construct MOOCCubeX, a large, knowledge-centered repository consisting of 4,216 courses, 230,263 videos, 358,265 exercises, 637,572 fine-grained concepts, and over 296 million behavioral records of 3,330,294 students, to support research on adaptive learning in MOOCs. Licensed by XuetangX, one of the largest MOOC websites in China, we obtain abundant and diverse course resources and student behavioral data and are permitted to make subsequent periodic updates. We propose a framework that performs data processing, weakly supervised fine-grained concept graph mining, and data curation to improve the usability and richness of the repository. Based on the fine-grained concepts, we reorganize the data from the knowledge perspective and acquire additional external learning resources from the web. Our repository is now available at https://github.com/THU-KEG/MOOCCubeX.