Abstract: Large language models (LLMs) have recently demonstrated potential to act as autonomous agents for sequential decision-making tasks. However, most existing methods either take actions greedily without planning or rely on static plans that are not adaptable to environmental feedback. Consequently, the sequential decision-making performance of LLM agents degrades as problem complexity and plan horizon increase. We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback. In AdaPlanner, the LLM agent adaptively refines its plan from feedback with both in-plan and out-of-plan refinement strategies. To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities. Furthermore, we propose a skill discovery mechanism that leverages successful plans as few-shot exemplars, enabling the agent to plan and refine with fewer task demonstrations. Our experiments in the ALFWorld and MiniWoB++ environments demonstrate that AdaPlanner outperforms state-of-the-art baselines by 3.73% and 4.11% while utilizing 2x and 600x fewer samples, respectively.

What problem does this paper attempt to address?

This paper addresses the limitations of existing autonomous agents based on large language models (LLMs) in sequential decision-making tasks. Specifically:

1. **Lack of Adaptability**: Existing methods either act greedily (taking actions directly without planning) or rely on static plans that cannot be adjusted according to environmental feedback. As a result, the sequential decision-making performance of LLM agents declines as problem complexity and the planning horizon increase.
2. **Insufficient Utilization of Feedback**: Some methods do adjust decisions using environmental feedback, but they usually update only the action currently being executed rather than the entire plan. Such methods can adapt to environmental changes in the short term, but may hurt performance over the long run.
3. **Inaccurate Initial Planning**: Even if the locally optimal action is taken at each step, errors in the initial plan may eventually lead to task failure or non-completion.

To solve these problems, the authors propose a closed-loop method, AdaPlanner, which allows LLM agents to adaptively refine their self-generated plans according to environmental feedback. It does so through the following mechanisms:

- **Planning and Refinement**: The LLM agent in AdaPlanner not only generates the initial plan but also adjusts it dynamically during execution according to environmental feedback. There are two refinement strategies: **in-plan refinement** (handling feedback the plan anticipated) and **out-of-plan refinement** (handling feedback the plan did not anticipate).
- **Code-Style Prompt Structure**: To reduce hallucinations (i.e., the model generating untrue or irrelevant information), AdaPlanner adopts a code-style LLM prompt structure, which facilitates plan generation across a variety of tasks, environments, and agent capabilities.
- **Skill Discovery Mechanism**: AdaPlanner also introduces a skill discovery mechanism, which uses successful plans as few-shot exemplars, enabling agents to plan and refine with fewer task demonstrations.

Experimental results show that AdaPlanner outperforms existing state-of-the-art baselines in both the ALFWorld and MiniWoB++ environments, increasing the success rate by 3.73% and 4.11% respectively, while using one half and one six-hundredth as many samples, respectively. These results demonstrate the effectiveness and efficiency of AdaPlanner in using environmental feedback for plan refinement.
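
The closed-loop idea summarized above can be illustrated with a small toy sketch. This is not the authors' implementation: there is no LLM involved, and the world, the action strings, and every name (`WORLD`, `SKILLS`, `env_step`, `parse_objects`, `propose_plan`, `run_episode`) are hypothetical, chosen only for illustration. Under those assumptions, the sketch shows a plan written as code-like steps, an in-plan step that reads anticipated feedback, an out-of-plan step that revises the plan when the feedback contradicts it, and a crude skill memory that reuses a past successful plan.

```python
from typing import Dict, List, Tuple

# Hypothetical toy world: which objects sit at which location.
WORLD: Dict[str, List[str]] = {
    "drawer": [],
    "shelf": ["book"],
    "table": ["mug", "pen"],
}

# Hypothetical skill memory: target object -> location from a past successful plan.
SKILLS: Dict[str, str] = {}


def env_step(action: str) -> str:
    """Toy environment: turn an action string into a text observation."""
    verb, _, arg = action.partition(" ")
    if verb == "goto" and arg in WORLD:
        return "you see: " + ", ".join(WORLD[arg])
    if verb == "take":
        obj, _, place = arg.partition(" from ")
        if obj in WORLD.get(place, []):
            return f"you take the {obj}"
    return "nothing happens"


def parse_objects(obs: str) -> List[str]:
    """Extract object names from a 'you see: ...' observation."""
    prefix = "you see: "
    if not obs.startswith(prefix):
        return []
    return [o for o in obs[len(prefix):].split(", ") if o]


def propose_plan(target: str, candidates: List[str]) -> List[str]:
    """Stand-in for the LLM planner (assumption: a real system would prompt a model).

    Prefers a location remembered from a past success, else the first candidate.
    """
    place = SKILLS.get(target)
    if place not in candidates:
        place = candidates[0]
    return [f"goto {place}", f"take {target} from {place}"]


def run_episode(target: str, candidates: List[str]) -> Tuple[bool, List[str]]:
    """Execute, check feedback against the plan, and revise until done."""
    remaining = list(candidates)
    trace: List[str] = []
    while remaining:
        goto, take = propose_plan(target, remaining)
        place = goto.split(" ", 1)[1]
        trace.append(goto)
        # In-plan step: the plan anticipated an observation here and parses it.
        seen = parse_objects(env_step(goto))
        if target in seen:
            trace.append(take)
            success = env_step(take) == f"you take the {target}"
            if success:
                SKILLS[target] = place  # keep the successful plan for reuse
            return success, trace
        # Out-of-plan step: feedback contradicts the plan, so the rest of the
        # plan is regenerated without the location that just failed.
        trace.append(f"revise: no {target} at {place}")
        remaining.remove(place)
    return False, trace
```

Calling `run_episode("mug", ["drawer", "shelf", "table"])` with an empty `SKILLS` memory revises the plan twice before succeeding at the table; a second call with the same arguments goes to the table directly, because the first success was stored. The sketch only mirrors the control flow of the summary and says nothing about how AdaPlanner prompts the model.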

AdaPlanner: Adaptive Planning from Feedback with Language Models

AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback

Learning adaptive planning representations with natural language guidance

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

Improving Planning with Large Language Models: A Modular Agentic Architecture

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

ADaPT: As-Needed Decomposition and Planning with Language Models

Sequential Planning in Large Partially Observable Environments guided by LLMs

Query-Efficient Planning with Language Models

LLM+P: Empowering Large Language Models with Optimal Planning Proficiency

Enhancing Robot Task Planning: Integrating Environmental Information and Feedback Insights Through Large Language Models

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example

LASP: Surveying the State-of-the-Art in Large Language Model-Assisted AI Planning

Inner Monologue: Embodied Reasoning through Planning with Language Models

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models

Interactive and Expressive Code-Augmented Planning with Large Language Models

ReasonPlanner: Enhancing Autonomous Planning in Dynamic Environments with Temporal Knowledge Graphs and LLMs

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation