Abstract: There has been a growing interest in developing learner models to enhance learning and teaching experiences in educational environments. However, existing works have primarily focused on structured environments relying on meticulously crafted representations of tasks, thereby limiting the agent's ability to generalize skills across tasks. In this paper, we aim to enhance the generalization capabilities of agents in open-ended text-based learning environments by integrating Reinforcement Learning (RL) with Large Language Models (LLMs). We investigate three types of agents: (i) RL-based agents that utilize natural language for state and action representations to find the best interaction strategy, (ii) LLM-based agents that leverage the model's general knowledge and reasoning through prompting, and (iii) hybrid LLM-assisted RL agents that combine these two strategies to improve agents' performance and generalization. To support the development and evaluation of these agents, we introduce PharmaSimText, a novel benchmark derived from the PharmaSim virtual pharmacy environment designed for practicing diagnostic conversations. Our results show that RL-based agents excel in task completion but lack in asking quality diagnostic questions. In contrast, LLM-based agents perform better in asking diagnostic questions but fall short of completing the task. Finally, hybrid LLM-assisted RL agents enable us to overcome these limitations, highlighting the potential of combining RL and LLMs to develop high-performing agents for open-ended learning environments.

What problem does this paper attempt to address?

The paper addresses how to build agents that generalize better in open-ended, text-based educational environments. Specifically, the authors try to improve how well agents' skills transfer across tasks by combining Reinforcement Learning (RL) with Large Language Models (LLMs). The core contributions of the paper include:

1. **Proposing a new benchmark**: The researchers introduce PharmaSimText, a benchmark derived from PharmaSim, a virtual pharmacy environment used for practicing diagnostic dialogues. The benchmark includes over 500 scenario variations and can be used to develop and evaluate learning agents.
2. **Designing three types of agents**: The researchers explore RL-based agents, LLM-based agents, and hybrid agents that combine both. These agents perform tasks on the PharmaSimText benchmark, such as conducting effective diagnostic dialogues with patients and making accurate diagnoses.
   - **RL-based agents**: Use natural language to represent states and actions, and search for the best interaction strategy.
   - **LLM-based agents**: Use the model's general knowledge and reasoning capabilities, guiding action selection through prompting.
   - **Hybrid LLM-assisted RL agents**: Combine the two strategies above to improve agent performance and generalization.
3. **Experimental evaluation**: The researchers evaluated the three types of agents extensively, testing their ability to conduct effective diagnostic dialogues and reach accurate diagnoses. The evaluation also focused on how agent performance differs across patient configurations.

Through this work, the paper aims to determine which type of agent most effectively conducts diagnostic dialogues and makes accurate diagnoses on the PharmaSimText benchmark. It also explores the impact of reflective prompting on the performance of LLM-based agents, as well as how diagnostic performance and dialogue quality vary across agent types when facing different patients.
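The contrast between the three agent types can be illustrated with a small Python sketch. It is not the paper's implementation: the summary does not say how the hybrid agents combine the LLM and the RL policy, so the shortlist-then-rank scheme below is only one assumed possibility. The names `rl_choose`, `llm_choose`, `hybrid_choose`, and `stub_suggest` are hypothetical, the Q-values are made up for illustration, and `stub_suggest` is a hand-written stand-in for a real LLM call.

```python
from typing import Callable, Dict, List, Tuple

# Q-values keyed by (text state, text action); values here are illustrative only.
QTable = Dict[Tuple[str, str], float]
Suggest = Callable[[str, List[str]], List[str]]


def rl_choose(q: QTable, state: str, actions: List[str]) -> str:
    """RL-style choice: pick the action with the highest learned Q-value."""
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def llm_choose(suggest: Suggest, state: str, actions: List[str]) -> str:
    """LLM-style choice: take the top-ranked suggestion from the model."""
    return suggest(state, actions)[0]


def hybrid_choose(q: QTable, suggest: Suggest, state: str,
                  actions: List[str], k: int = 2) -> str:
    """Assumed hybrid: the LLM shortlists k actions, the Q-values pick among them."""
    shortlist = suggest(state, actions)[:k]
    return rl_choose(q, state, shortlist)


def stub_suggest(state: str, actions: List[str]) -> List[str]:
    """Stand-in for an LLM: ranks questions ahead of diagnoses, then alphabetically."""
    return sorted(actions, key=lambda a: (not a.startswith("ask"), a))


state = "patient has cough"
actions = ["diagnose: cold", "ask: symptom duration", "ask: age"]
q = {
    (state, "diagnose: cold"): 1.0,
    (state, "ask: symptom duration"): 0.8,
    (state, "ask: age"): 0.1,
}

rl_action = rl_choose(q, state, actions)
llm_action = llm_choose(stub_suggest, state, actions)
hybrid_action = hybrid_choose(q, stub_suggest, state, actions)
print(rl_action, "|", llm_action, "|", hybrid_action)
```

In this toy setup the RL-style choice goes straight to a diagnosis, the LLM-style choice asks a question, and the assumed hybrid asks the question that the Q-values rate highest. This loosely mirrors the behaviours described in the abstract but does not reproduce any reported result.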

Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs

Reinforcement Learning Problem Solving with Large Language Models

Evaluating large language models as agents in the clinic

Large Language Models as Agents in the Clinic

Words as Beacons: Guiding RL Agents with High-Level Language Prompts

Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Synergistic Simulations: Multi-Agent Problem Solving with Large Language Models

Mental Modeling of Reinforcement Learning Agents by Language Models

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

LLM-based Multi-Agent Reinforcement Learning: Current and Future Directions

Collaborating with language models for embodied reasoning

LLM Harmony: Multi-Agent Communication for Problem Solving

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations

AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments

Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach

Simulating Classroom Education with LLM-Empowered Agents

Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods

Broadening Access to Simulations for End-Users via Large Language Models: Challenges and Opportunities

Language Guided Exploration for RL Agents in Text Environments