Reinforcement Learning Problem Solving with Large Language Models

Sina Gholamian, Domingo Huh

2024-04-29

Abstract: Large Language Models (LLMs) encapsulate an extensive amount of world knowledge, and this has enabled their application in various domains to improve the performance of a variety of Natural Language Processing (NLP) tasks. This has also facilitated a more accessible paradigm of conversation-based interactions between humans and AI systems to solve intended problems. However, one interesting avenue that shows untapped potential is the use of LLMs as Reinforcement Learning (RL) agents to enable conversational RL problem solving. Therefore, in this study, we explore the concept of formulating Markov Decision Process-based RL problems as LLM prompting tasks. We demonstrate how LLMs can be iteratively prompted to learn and optimize policies for specific RL tasks. In addition, we leverage the introduced prompting technique for episode simulation and Q-Learning, facilitated by LLMs. We then show the practicality of our approach through two detailed case studies for "Research Scientist" and "Legal Matter Intake" workflows.

Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

This paper examines how large language models (LLMs) can act as reinforcement learning (RL) agents for problems formulated as Markov Decision Processes (MDPs). LLMs already perform well on natural language processing tasks and can converse with humans, but their potential as RL agents has not been fully explored. The authors propose an iterative prompting strategy that recasts an RL problem as an LLM prompting task, so that the LLM learns and refines a policy for the task over successive prompts. They apply the same prompting technique to episode simulation and Q-Learning, both carried out by the LLM. The paper demonstrates the approach through two case studies, the "Research Scientist" and "Legal Matter Intake" workflows, in which the LLM finds the optimal workflow within at most two iterations. In summary, the paper addresses how to leverage the knowledge and reasoning capabilities of LLMs to solve RL problems through iterative prompting, giving human users a more intuitive, conversational way to pose such problems. The approach may change how RL optimization problems are set up and solved.
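For readers unfamiliar with the Q-Learning update that the paper has the LLM carry out through prompts, the following is a minimal sketch in ordinary Python. It is not the authors' method or prompt: the four-stage workflow, the two actions, and all reward values are hypothetical and chosen only to make the update rule concrete.

```python
import random

# Hypothetical toy workflow (NOT taken from the paper): four stages in order.
STATES = ["idea", "literature_review", "experiment", "paper_written"]
ACTIONS = ["advance", "skip_ahead"]  # assumed action set for illustration
TERMINAL = "paper_written"


def step(state, action):
    """Deterministic toy transition; returns (next_state, reward)."""
    i = STATES.index(state)
    if action == "advance":
        nxt = STATES[i + 1]
        # Assumed rewards: small cost per stage, bonus on finishing.
        return nxt, (10.0 if nxt == TERMINAL else -1.0)
    # skip_ahead jumps two stages (capped at the end) and is penalized.
    nxt = STATES[min(i + 2, len(STATES) - 1)]
    return nxt, -20.0


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = STATES[0]
        while state != TERMINAL:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward = step(state, action)
            # No bootstrapping from the terminal state.
            best_next = 0.0 if nxt == TERMINAL else max(q[(nxt, a)] for a in ACTIONS)
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES if s != TERMINAL}


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

With these invented rewards, skipping a stage is never worthwhile, so the learned greedy policy advances through every stage in order. In the paper's setting the environment description, episode simulation, and value updates are expressed in prompts rather than in a hand-written loop like this one.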
