EduAgent: Generative Student Agents in Learning

Songlin Xu, Xinyu Zhang, Lianhui Qin
2024-03-24
Abstract: Student simulation in online education is important to address dynamic learning behaviors of students with diverse backgrounds. Existing simulation models based on deep learning usually need massive training data and lack prior knowledge in educational contexts. Large language models (LLMs) may contain such prior knowledge since they are pre-trained on a large corpus. However, because student behaviors are dynamic and multifaceted with individual differences, directly prompting LLMs is neither robust nor accurate enough to capture fine-grained interactions among diverse student personas, learning behaviors, and learning outcomes. This work tackles this problem by presenting a newly annotated fine-grained large-scale dataset and proposing EduAgent, a novel generative agent framework incorporating cognitive prior knowledge (i.e., theoretical findings revealed in cognitive science) to guide LLMs to first reason about correlations among various behaviors and then make simulations. Our two experiments show that EduAgent could not only mimic and predict learning behaviors of real students but also generate realistic learning behaviors of virtual students without real data.
Computers and Society, Artificial Intelligence, Computation and Language, Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
The paper addresses the problem of simulating student learning behaviors in online education. Existing deep learning-based simulation models typically require large amounts of training data and lack prior knowledge of the educational context. Large Language Models (LLMs) may contain such prior knowledge because they are pre-trained on extensive corpora. However, since student behaviors are dynamic, multifaceted, and vary among individuals, directly prompting LLMs cannot robustly or accurately capture the fine-grained interactions among student personas, learning behaviors, and learning outcomes.

To overcome these issues, the authors present a new fine-grained, large-scale dataset and a new generative agent framework named EduAgent. EduAgent incorporates cognitive prior knowledge (i.e., theoretical findings from cognitive science) to guide LLMs to first reason about the correlations between various behaviors and then perform simulations. With this approach, EduAgent can not only mimic and predict the learning behaviors of real students but also generate realistic learning behaviors of virtual students, even without real data.

### Main Contributions:

1. **Dataset**: Provides a new large-scale, fine-grained annotated learning behavior dataset, including data from 311 real students and 705 virtual students.
2. **Generative Agent Framework**: Develops an open-source generative agent framework, EduAgent, which follows cognitive prior knowledge to realistically simulate learning behaviors in online education.
3. **Experimental Validation**: Conducts comprehensive experiments to validate the effectiveness of the EduAgent framework and evaluates the capability of state-of-the-art LLMs in modeling fine-grained learning behaviors.

### Key Technologies:

- **Cognitive Prior Knowledge**: Uses theoretical findings in cognitive science to guide LLMs in reasoning about the correlations between student behaviors.
- **Fine-Grained Data**: The dataset includes multi-dimensional records such as students' gaze trajectories, mouse control behaviors, and six different cognitive states.
- **Modular Architecture**: The EduAgent framework stores different behaviors in different memory layers and infers how these behaviors are modulated by student characteristics and course content, so that the learning process is simulated more accurately.

### Experimental Results:

- **Personalized Student Behavior Prediction**: EduAgent can accurately predict the learning behaviors and test results of real students, even with only a small amount of historical data.
- **Virtual Student Generation**: The generated virtual students exhibit behavior patterns consistent with real students, even without real training data.

In summary, this paper addresses the challenge of simulating student behaviors in online education by combining cognitive prior knowledge with a generative agent framework, providing new insights for the development of intelligent teaching systems.
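
To make the modular architecture listed under Key Technologies more concrete, here is a minimal Python sketch of per-behavior memory layers feeding a "reason first, then simulate" prompt. It is an illustration under stated assumptions, not the authors' implementation: the names `StudentMemory`, `record`, `recent`, and `build_prompt`, the layer names, and the prompt wording are all hypothetical, and no LLM is actually called.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StudentMemory:
    """Hypothetical layered memory: one list of observations per behavior type."""
    persona: str
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, layer: str, observation: str) -> None:
        # Append an observation to the named layer, creating the layer if needed.
        self.layers.setdefault(layer, []).append(observation)

    def recent(self, layer: str, k: int = 3) -> List[str]:
        # Return up to the k most recent observations from one layer.
        return self.layers.get(layer, [])[-k:]


def build_prompt(memory: StudentMemory, course_content: str, prior: str) -> str:
    """Assemble a two-step prompt (assumed format): reason, then simulate."""
    lines = [
        f"Student persona: {memory.persona}",
        f"Course content: {course_content}",
        f"Cognitive prior: {prior}",
    ]
    # Include each memory layer separately so behaviors stay distinguishable.
    for layer in sorted(memory.layers):
        lines.append(f"Recent {layer}: " + "; ".join(memory.recent(layer)))
    lines.append("Step 1: Explain how these behaviors relate to each other.")
    lines.append("Step 2: Predict the student's next behavior and cognitive state.")
    return "\n".join(lines)


if __name__ == "__main__":
    memory = StudentMemory(persona="first-year student, little prior knowledge")
    memory.record("gaze", "looked at the slide title")
    memory.record("mouse", "hovered over the diagram")
    print(build_prompt(memory, "Introduction to neural networks",
                       "an illustrative placeholder for a cognitive-science finding"))
```

Keeping each behavior type in its own layer means the prompt can present gaze, mouse, and cognitive-state histories separately, which is one plausible way to let a model reason about correlations among them before predicting the next step.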