Simulating Student Interactions with Two-stage Imitation Learning for Intelligent Educational Systems

Guanhao Zhao, Zhenya Huang, Yan Zhuang, Jiayu Liu, Qi Liu, Zhiding Liu, Jinze Wu, Enhong Chen
DOI: https://doi.org/10.1145/3583780.3615060
2023-01-01
Abstract: The fundamental task of intelligent educational systems is to offer adaptive learning services to students, such as exercise recommendation and computerized adaptive testing. However, optimizing the models behind these services is hampered in practice by the difficulty of collecting high-quality interaction data. A student simulator is therefore of great value, since it can generate valid interactions to help optimize these models. Existing simulators have achieved success but generally suffer from exposure bias and overlook long-term intentions. To tackle these problems, we propose a novel Direct-Adversarial Imitation Student Simulator (DAISim), which formulates simulation as a Markov Decision Process (MDP). This formulation unifies the simulator's workflow in training and generation, alleviating both exposure bias and single-step optimization. To capture the intentions underlying complex student interactions, we first propose a direct imitation strategy that mimics the interactions with a simple reward function. We then propose an adversarial imitation strategy that learns a rational distribution, with the reward given by a parameterized discriminator. Furthermore, we optimize the discriminator in adversarial imitation in a pairwise manner, and theoretical analysis shows that the pairwise discriminator improves generation quality. Extensive experiments on real-world datasets demonstrate that DAISim can simulate high-quality student interactions whose distribution is close to the real distribution and can benefit several downstream services.
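The abstract does not spell out the pairwise objective used for the discriminator. As a rough illustration only, one common way to train a scorer "in a pairwise manner" is a logistic ranking loss that pushes the score of a real interaction sequence above the score of a generated one. The function name, its arguments, and the choice of loss below are assumptions made for this sketch, not DAISim's published formulation.

```python
import math

def pairwise_discriminator_loss(real_scores, generated_scores):
    """Hypothetical pairwise logistic loss: mean of -log(sigmoid(s_real - s_gen)).

    real_scores[i] and generated_scores[i] are discriminator scores for a
    real and a generated interaction sequence that form one pair.
    """
    if len(real_scores) != len(generated_scores) or not real_scores:
        raise ValueError("need two non-empty score lists of equal length")
    total = 0.0
    for s_real, s_gen in zip(real_scores, generated_scores):
        margin = s_real - s_gen
        # softplus(-margin) equals -log(sigmoid(margin)); this form avoids overflow
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(real_scores)
```

Under this assumed loss, the value shrinks as real sequences are scored further above their generated counterparts, so minimizing it trains the discriminator to rank pairs rather than to classify each sequence in isolation.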