Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs

Bahar Radmehr, Adish Singla, Tanja Käser
2024-04-29
Abstract: There has been a growing interest in developing learner models to enhance learning and teaching experiences in educational environments. However, existing works have primarily focused on structured environments relying on meticulously crafted representations of tasks, thereby limiting the agent's ability to generalize skills across tasks. In this paper, we aim to enhance the generalization capabilities of agents in open-ended text-based learning environments by integrating Reinforcement Learning (RL) with Large Language Models (LLMs). We investigate three types of agents: (i) RL-based agents that utilize natural language for state and action representations to find the best interaction strategy, (ii) LLM-based agents that leverage the model's general knowledge and reasoning through prompting, and (iii) hybrid LLM-assisted RL agents that combine these two strategies to improve agents' performance and generalization. To support the development and evaluation of these agents, we introduce PharmaSimText, a novel benchmark derived from the PharmaSim virtual pharmacy environment designed for practicing diagnostic conversations. Our results show that RL-based agents excel in task completion but lack in asking quality diagnostic questions. In contrast, LLM-based agents perform better in asking diagnostic questions but fall short of completing the task. Finally, hybrid LLM-assisted RL agents enable us to overcome these limitations, highlighting the potential of combining RL and LLMs to develop high-performing agents for open-ended learning environments.
Machine Learning, Artificial Intelligence, Computers and Society
What problem does this paper attempt to address?
The paper addresses the problem of building agents that generalize better in open-ended, text-based educational environments. Specifically, the authors combine Reinforcement Learning (RL) with Large Language Models (LLMs) so that an agent's skills transfer across different tasks. The core contributions of the paper are:

1. **Proposing a new benchmark**: The researchers introduce PharmaSimText, a benchmark derived from PharmaSim, a virtual pharmacy environment used for practicing diagnostic dialogues. The benchmark includes over 500 scenario variations and can be used to develop and evaluate learning agents.
2. **Designing three types of agents**: The researchers explore RL-based agents, LLM-based agents, and hybrid agents that combine both. The agents perform tasks on the PharmaSimText benchmark, such as conducting effective diagnostic dialogues with patients and making accurate diagnoses.
   - **RL-based agents**: Represent states and actions in natural language and search for the best interaction strategy.
   - **LLM-based agents**: Use the model's general knowledge and reasoning, elicited through prompting, to select actions.
   - **Hybrid LLM-assisted RL agents**: Combine the two strategies above to improve agent performance and generalization.
3. **Experimental evaluation**: The researchers evaluate all three types of agents on their ability to conduct effective diagnostic dialogues and reach accurate diagnoses, with particular attention to how performance differs across patient configurations.

Through this work, the paper aims to determine which type of agent most effectively conducts diagnostic dialogues and makes accurate diagnoses on the PharmaSimText benchmark.
It also explores the impact of reflective prompting on the performance of LLM-based agents, as well as the variations in diagnostic performance and dialogue quality of different types of agents when facing different patients.
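
To make the hybrid idea concrete, the following is a minimal sketch of one possible way to pair a suggester with an RL learner: a stand-in for the LLM shortlists candidate actions, and a tabular Q-learner keyed on text states and text actions chooses among them. This is not the paper's implementation, and the summary above does not specify how the authors combine the two components. The toy scenario, the `suggest_actions` placeholder, the reward values, and all hyperparameters are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy scenario, NOT taken from PharmaSimText: the agent should
# ask two diagnostic questions before committing to a diagnosis.
QUESTIONS = ["ask about symptoms", "ask about duration"]
DIAGNOSES = ["diagnose: cold", "diagnose: allergy"]
CORRECT = "diagnose: cold"


def step(state, action):
    """Return (next_state, reward, done) for a text state and a text action."""
    if action in DIAGNOSES:
        asked_all = all(q in state for q in QUESTIONS)
        reward = 1.0 if (action == CORRECT and asked_all) else -1.0
        return state, reward, True
    if action in state:
        return state, -0.1, False  # small penalty for repeating a question
    return state + " | " + action, 0.0, False


def suggest_actions(state):
    """Stand-in for an LLM call that shortlists plausible actions.

    A real system would prompt a language model here; this rule-based
    placeholder (an assumption) only keeps the example runnable offline.
    """
    unasked = [q for q in QUESTIONS if q not in state]
    return unasked if unasked else DIAGNOSES


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over (text state, text action) pairs."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = "start", False
        while not done:
            candidates = suggest_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(candidates)  # explore within the shortlist
            else:
                action = max(candidates, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(
                q[(next_state, a)] for a in suggest_actions(next_state))
            q[(state, action)] += alpha * (
                reward + gamma * future - q[(state, action)])
            state = next_state
    return q


def greedy_episode(q, max_steps=10):
    """Roll out the learned policy greedily; return (actions, final reward)."""
    state, taken, reward = "start", [], 0.0
    for _ in range(max_steps):
        action = max(suggest_actions(state), key=lambda a: q[(state, a)])
        state, reward, done = step(state, action)
        taken.append(action)
        if done:
            break
    return taken, reward
```

Calling `greedy_episode(train())` rolls out the learned policy on the toy scenario. In this sketch the suggester constrains exploration while the Q-values decide among the shortlisted actions; other divisions of labor between the two components are equally plausible.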