Knowledge Tracing As Language Processing: A Large-Scale Autoregressive Paradigm

Bojun Zhan, Teng Guo, Xueyi Li, Mingliang Hou, Qianru Liang, Boyu Gao, Weiqi Luo, Zitao Liu
DOI: https://doi.org/10.1007/978-3-031-64302-6_13
2024-01-01
Abstract: Knowledge tracing (KT) is the process of modelling students' cognitive states to forecast their future academic performance, using their historical learning interactions as a reference. Recent studies have introduced a range of deep learning-based knowledge tracing (DLKT) methods, which have shown considerable potential. Considering the excellent performance of large models in various domains, we explore the possibility of migrating their architecture to the KT domain. We posit that the efficacy of the large language model (LLM) can be largely attributed to its use of an auto-regressive Transformer decoder, which facilitates the learning of comprehensive representations and the processing of extensive data. Hence, we propose a DLKT model, LLM-KT, which is based on the LLM architecture. This model addresses the long-term dependency between students' historical interactions and their subsequent performance through a stack of Transformer decoders. To fully utilize the potential of large models, we evaluate our model's capabilities on EdNet, which is currently the world's largest real KT dataset. Through a series of quantitative and qualitative experimental analyses, we answer two key questions: (1) is it feasible to apply LLM-like architectures in the KT domain? (2) can continually scaling up the model improve prediction performance in KT? To encourage reproducible research, we make our data and code publicly available at https://github.com/ai4ed/AIED2024-LLM-KT.
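To make the auto-regressive decoder idea concrete, the following is a minimal sketch of a single causally masked self-attention layer applied to a student's interaction sequence. It is not the authors' LLM-KT implementation: the token encoding (question id combined with the response), the single attention head, the omission of layer normalization and feed-forward blocks, and all names (`init_params`, `causal_self_attention`, `predict_next_correct`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def init_params(num_questions, d_model, max_len, seed=0):
    """Create random (untrained) parameters for the illustrative model."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "num_questions": num_questions,
        # One embedding per (question, response) pair; response is 0 or 1.
        "embed": rng.normal(0, scale, (2 * num_questions, d_model)),
        "pos": rng.normal(0, scale, (max_len, d_model)),
        "w_q": rng.normal(0, scale, (d_model, d_model)),
        "w_k": rng.normal(0, scale, (d_model, d_model)),
        "w_v": rng.normal(0, scale, (d_model, d_model)),
        "w_out": rng.normal(0, scale, d_model),
    }


def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head attention where position t attends only to positions <= t."""
    seq_len = x.shape[0]
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    # Causal mask: hide future interactions from each position.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over the visible positions.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def predict_next_correct(question_ids, responses, params):
    """Return, per position, a probability that the next response is correct."""
    # Assumed encoding: fold question id and response into one token id.
    tokens = question_ids + params["num_questions"] * responses
    x = params["embed"][tokens] + params["pos"][: len(tokens)]
    attended, weights = causal_self_attention(
        x, params["w_q"], params["w_k"], params["w_v"]
    )
    hidden = x + attended  # residual connection
    logits = hidden @ params["w_out"]
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, weights


params = init_params(num_questions=5, d_model=8, max_len=16)
question_ids = np.array([0, 3, 1, 4])
responses = np.array([1, 0, 1, 1])
probs, weights = predict_next_correct(question_ids, responses, params)
print(probs.shape, weights.shape)
```

Because of the causal mask, the prediction at each step depends only on interactions up to that step, which is the property that lets a decoder-style model be trained on every position of a sequence at once. A full decoder stack would repeat such layers with multiple heads, normalization, and feed-forward blocks.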