Abstract: Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets. Recent studies have demonstrated that LLMs can assist an embodied agent in solving complex sequential decision-making tasks by providing high-level instructions. However, interactions with LLMs can be time-consuming. In many practical scenarios, LLMs require so much storage space that they can only be deployed on remote cloud servers. Additionally, using commercial LLMs can be costly since they may charge based on usage frequency. In this paper, we explore how to enable intelligent, cost-effective interactions between a downstream task-oriented agent and an LLM. We find that this problem can be naturally formulated as a Markov decision process (MDP), and propose When2Ask, a reinforcement learning-based approach that learns when it is necessary to query LLMs for high-level instructions to accomplish a target task. On one side, When2Ask discourages unnecessary redundant interactions, while on the other side, it enables the agent to identify and follow useful instructions from the LLM. This enables the agent to halt an ongoing plan and transition to a more suitable one based on new environmental observations. Experiments on MiniGrid and Habitat environments that entail planning sub-goals demonstrate that When2Ask learns to solve target tasks with only a few necessary interactions with the LLM, significantly reducing interaction costs in testing environments compared with baseline methods. Our code is available at https://github.com/ZJLAB-AMMI/LLM4RL.

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **Efficient Interaction**: How to achieve intelligent, cost-effective interaction between an agent and a large language model (LLM). Existing methods either waste resources by querying too often (LLM usage fees, communication overhead, and inference time) or query too rarely, so the agent fails to obtain useful instructions in time to adjust its plan in a complex, changing environment.
2. **Timely Consultation**: Deciding when to request new high-level instructions from the LLM is difficult and usually requires task-specific expertise. For example, when an agent encounters an obstacle, it should recognize the situation and consult the LLM for advice on how to handle it, rather than continuing with an outdated plan.
3. **Reducing Non-informative Interactions**: The number of unnecessary interactions between the agent and the LLM should be reduced while still ensuring that the agent completes the target task.

To address these challenges, the paper proposes When2Ask, a reinforcement learning-based approach that trains an agent to learn when to request high-level instructions from the LLM in order to complete a specific task. By minimizing unnecessary interactions, the method significantly reduces interaction costs in the test environment, and it also improves the task success rate. Specifically, When2Ask adopts a Planner-Actor-Mediator framework, where:

- **Planner**: Played by a pre-trained LLM, responsible for generating high-level instructions.
- **Actor**: Executes the instructions provided by the Planner.
- **Mediator**: Acts as an interface between the Planner and the Actor, deciding when to request new instructions from the Planner and converting observations into text descriptions that the LLM can understand.
Through this approach, When2Ask not only reduces the interaction costs between the agent and the LLM but also improves the efficiency and success rate of task completion. Experimental results show that compared to baseline methods, this approach can significantly reduce interaction costs in various environments while maintaining or improving the success rate of task completion.
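
The division of labour in the Planner-Actor-Mediator framework can be illustrated with a minimal Python sketch. This is not the authors' implementation: `StubPlanner`, `describe`, `ask_policy`, and `run_episode` are hypothetical names, the planner is a rule-based stand-in for the LLM, and the asking rule is hand-written, whereas When2Ask learns that decision with reinforcement learning. The sketch only shows where the "ask or keep the current plan" decision sits in the control loop and how it affects the number of LLM queries.

```python
from dataclasses import dataclass


@dataclass
class StubPlanner:
    """Stand-in for the pre-trained LLM; counts how often it is queried."""
    calls: int = 0

    def plan(self, description: str) -> str:
        self.calls += 1
        # Hypothetical rule: handle an obstacle first, otherwise head to the goal.
        return "remove obstacle" if "obstacle" in description else "go to goal"


def describe(obs: dict) -> str:
    """Mediator, translation half: turn an observation into text for the planner."""
    return "obstacle ahead" if obs["blocked"] else "path is clear"


def ask_policy(obs: dict, current_plan) -> bool:
    """Mediator, asking half. Hand-written stand-in for the learned policy:
    ask when there is no plan yet or the observation contradicts the plan."""
    if current_plan is None:
        return True
    return obs["blocked"] != (current_plan == "remove obstacle")


def always_ask(obs: dict, current_plan) -> bool:
    """Baseline that queries the planner at every step."""
    return True


def run_episode(observations, planner, should_ask):
    """Run one episode and return the instruction followed at each step."""
    plan = None
    followed = []
    for obs in observations:
        if should_ask(obs, plan):
            plan = planner.plan(describe(obs))
        # Actor: in this toy example it just records the instruction it follows.
        followed.append(plan)
    return followed


observations = [{"blocked": b} for b in (False, False, True, False, False)]

selective_planner = StubPlanner()
selective_run = run_episode(observations, selective_planner, ask_policy)

baseline_planner = StubPlanner()
baseline_run = run_episode(observations, baseline_planner, always_ask)

print(selective_planner.calls, baseline_planner.calls)
```

In this toy episode both runs follow the same sequence of instructions, but the selective mediator queries the planner only when the situation changes. Replacing `ask_policy` with a trained policy is the part that When2Ask addresses.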

Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach
