Abstract: Knowledge Tracing (KT) is vital in educational data mining, enabling personalized learning by tracking learners' knowledge states and forecasting their academic outcomes. This study introduces the LOKT (Large Language Model Option-weighted Knowledge Tracing) model, which uses large language models (LLMs) to address the cold-start problem, where only limited historical data is available. While traditional KT models have incorporated option weights, our research extends this by integrating these weights into an LLM-based KT framework. Moving beyond the binary classification of correct and incorrect responses, we emphasize that different types of incorrect answers offer valuable insights into a learner's knowledge state. By converting these responses into text-based ordinal categories, we enable LLMs to assess learner understanding with greater clarity, although our approach focuses on the final knowledge state rather than the progression of learning over time. Using five public datasets, we demonstrate that the LOKT model sustains high predictive accuracy even with limited data, effectively addressing both "learner cold-start" and "system cold-start" scenarios. These findings showcase LOKT's potential to enhance LLM-based learning tools and support early-stage personalization.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses the cold-start problem in Knowledge Tracing (KT) within educational data mining: how to assess learners' knowledge states accurately and predict their academic performance when historical data is limited. Specifically, the paper proposes the LOKT (Large Language Model Option-weighted Knowledge Tracing) model, which tackles the cold-start challenge by combining large language models (LLMs) with option weights.

#### Background of the cold-start problem

In cold-start situations, traditional KT models lack sufficient interaction data and therefore struggle to provide accurate initial assessments, which in turn degrades personalized feedback and learning-path recommendations. This limitation is most pronounced for new users and new systems. Existing methods usually classify learners' answers only as correct or incorrect, ignoring the information that different types of wrong answers can provide.

#### Core improvements of the LOKT model

1. **Introduction of option weights**: The LOKT model considers not only whether an answer is correct, but also the selection frequency and weight of each option. This lets the model capture a learner's level of understanding at a finer granularity and distinguish between different degrees of misunderstanding.
2. **Textualization of option weights**: To help the LLM make use of option weights, the paper converts continuous weight values into text-based ordinal categories (such as "proficient", "partially understood", "limited", "insufficient"). This makes the numerical information easier for the LLM to interpret and so improves its reading of the learner's knowledge state.
3. **Improvement of performance in cold-start scenarios**: Experimental results show that the LOKT model maintains high prediction accuracy in both "learner cold-start" and "system cold-start" scenarios, even when the amount of data is limited.

### Formula summary

- Learner score calculation formula:

  \[ \bar{x}_i = \frac{N_i - \sum_{q \in Q_i} d_q}{\vert Q_i \vert} \]

  where \( N_i \) is the number of questions correctly answered by learner \( i \), \( d_q \) is the difficulty of question \( q \) (defined as the ratio of the number of correct answers to the total number of answers), and \( Q_i \) is the set of all questions attempted by learner \( i \).

- Average score and standard deviation calculation formula, where \( I \) is the set of learners:

  \[ \bar{x} = \frac{1}{\vert I \vert} \sum_{i \in I} \bar{x}_i, \quad S_x = \sqrt{\frac{1}{\vert I \vert} \sum_{i \in I} (\bar{x}_i - \bar{x})^2} \]

- Option weight calculation formula:

  \[ w_{oq} = \frac{\bar{x}_{coq} - \bar{x}_{ncoq}}{S_x \times \sqrt{C_{oq} \cdot N_{C_{oq}}}} \]

  where \( \bar{x}_{coq} \) and \( \bar{x}_{ncoq} \) are the average scores of learners who selected and did not select option \( o \) of question \( q \), respectively, and \( C_{oq} \) and \( N_{C_{oq}} \) are the proportions of learners who selected and did not select option \( o \), respectively.

Through these improvements, the LOKT model can provide more accurate and detailed assessments of learners' knowledge states in cold-start scenarios, supporting the development and application of early personalized learning tools.
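The formulas in the summary above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the response tuple layout, the function names, and the toy data are assumptions, each learner is assumed to answer a question at most once, and "did not select option o" is taken to mean learners who answered the same question with another option. The weight is computed exactly as the formula is printed in this summary. The cutoffs in `to_ordinal` are hypothetical, since the summary does not state how weights are binned into the four text categories.

```python
from collections import defaultdict
from math import sqrt

# Assumed record layout: (learner_id, question_id, chosen_option, is_correct)
responses = [
    ("a", "q1", "A", True), ("a", "q2", "B", True),
    ("b", "q1", "A", True), ("b", "q2", "C", False),
    ("c", "q1", "D", False), ("c", "q2", "C", False),
]

def question_difficulty(responses):
    """d_q: ratio of correct answers to all answers for each question."""
    correct, total = defaultdict(int), defaultdict(int)
    for _, q, _, ok in responses:
        total[q] += 1
        correct[q] += int(ok)
    return {q: correct[q] / total[q] for q in total}

def learner_scores(responses):
    """x_bar_i = (N_i - sum of d_q over Q_i) / |Q_i| for each learner i."""
    d = question_difficulty(responses)
    n_correct, attempted = defaultdict(int), defaultdict(set)
    for i, q, _, ok in responses:
        n_correct[i] += int(ok)
        attempted[i].add(q)
    return {i: (n_correct[i] - sum(d[q] for q in qs)) / len(qs)
            for i, qs in attempted.items()}

def mean_and_std(scores):
    """Mean and population standard deviation S_x of the learner scores."""
    vals = list(scores.values())
    mean = sum(vals) / len(vals)
    return mean, sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))

def option_weight(responses, scores, s_x, question, option):
    """w_oq as printed in the summary; None when it is undefined."""
    chosen = {i: o for i, q, o, _ in responses if q == question}
    sel = [scores[i] for i, o in chosen.items() if o == option]
    rest = [scores[i] for i, o in chosen.items() if o != option]
    if not sel or not rest or s_x == 0:
        return None  # nobody (or everybody) picked the option, or no spread
    c, nc = len(sel) / len(chosen), len(rest) / len(chosen)
    diff = sum(sel) / len(sel) - sum(rest) / len(rest)
    return diff / (s_x * sqrt(c * nc))

def to_ordinal(weight, cutoffs=(1.0, 0.0, -1.0)):
    """Map a weight to a text category; the cutoffs are hypothetical."""
    labels = ("proficient", "partially understood", "limited", "insufficient")
    for cutoff, label in zip(cutoffs, labels):
        if weight >= cutoff:
            return label
    return labels[-1]

scores = learner_scores(responses)
_, s_x = mean_and_std(scores)
w_correct = option_weight(responses, scores, s_x, "q1", "A")
w_wrong = option_weight(responses, scores, s_x, "q1", "D")
print(to_ordinal(w_correct), to_ordinal(w_wrong))
```

In this toy data, option A of `q1` is chosen by the two higher-scoring learners and option D by the lowest-scoring one, so the two weights have opposite signs and map to opposite ends of the ordinal scale. With real data the cutoffs would need to be chosen from the observed weight distribution.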

Beyond Right and Wrong: Mitigating Cold Start in Knowledge Tracing Using Large Language Model and Option Weight
