Abstract: Planning and performing interactive tasks, such as conducting experiments to determine the melting point of an unknown substance, is straightforward for humans but poses significant challenges for autonomous agents. We introduce ReasonPlanner, a novel generalist agent designed for reflective thinking, planning, and interactive reasoning. This agent leverages LLMs to plan hypothetical trajectories by building a World Model based on a Temporal Knowledge Graph. The agent interacts with the environment using a natural language actor-critic module, where the actor translates the imagined trajectory into a sequence of actionable steps, and the critic determines if replanning is necessary. ReasonPlanner significantly outperforms previous state-of-the-art prompting-based methods on the ScienceWorld benchmark by more than 1.8 times, while being more sample-efficient and interpretable. It relies solely on frozen weights thus requiring no gradient updates. ReasonPlanner can be deployed and utilized without specialized knowledge of Machine Learning, making it accessible to a wide range of users.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper attempts to address the challenges of planning and executing complex interactive tasks in dynamic environments. Specifically, it proposes a novel general-purpose agent named ReasonPlanner, which aims to enhance autonomous planning capabilities, especially when conducting experimental tasks in simulated environments, such as determining the melting point of an unknown substance.

#### Main problems

1. **Complex task planning**: For humans, planning and executing complex interactive tasks (such as experimental operations) is relatively simple, but it is very challenging for autonomous agents. Traditional reinforcement learning (RL) methods perform poorly when dealing with large-scale non-discrete action spaces, especially in text-based environments where the action space may grow polynomially.
2. **Dynamic environmental changes**: In dynamic environments, agents need to be able to "foresee" future scenarios and replan according to environmental changes. Existing RL and large language model (LLM) methods have limitations in this regard.
3. **Sample efficiency and interpretability**: Existing methods usually require a large amount of sample data for training and lack interpretability, making them difficult to understand and debug.

#### Solutions

1. **Temporal Knowledge Graph (TKG)**: ReasonPlanner uses a Temporal Knowledge Graph to store and update environmental information as its World Model (WM). This enables the agent to construct an internal representation of the environment and use it to predict future scenarios.
2. **Natural-language actor-critic module**: The agent interacts with the environment through a natural-language actor-critic module. The actor converts the imagined trajectory into a series of executable steps, while the critic evaluates the difference between the actual and predicted results and decides whether replanning is required.
3. **No weight update required**: ReasonPlanner relies on a pre-trained LLM and does not require gradient updates, thereby improving sample efficiency and interpretability.

#### Experimental results

ReasonPlanner significantly outperforms existing prompt-based methods on the ScienceWorld benchmark, with an average score of over 65 (out of 100) and full marks on multiple tasks. In addition, ReasonPlanner performs well in terms of sample efficiency and interpretability, making it easier to deploy and use.

### Summary

By combining a Temporal Knowledge Graph with large language models, ReasonPlanner addresses the challenges of planning and executing complex tasks in dynamic environments and improves sample efficiency and interpretability, making it a promising autonomous planning solution.
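
The interplay between the Temporal Knowledge Graph and the critic described under Solutions can be illustrated with a small sketch. This is not the paper's implementation: the names `Fact`, `TemporalKnowledgeGraph`, `critic_needs_replan`, and `run_episode` are hypothetical, and the critic here is a plain equality check standing in for the LLM-based natural-language critic. It only shows the general shape of the loop: record timestamped facts, compare each observation with the predicted outcome, and stop for replanning on a mismatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One timestamped edge: (subject, relation, obj) observed at step t."""
    subject: str
    relation: str
    obj: str
    t: int


class TemporalKnowledgeGraph:
    """Toy world model (an assumption, not the paper's data structure):
    an append-only list of timestamped facts."""

    def __init__(self):
        self.facts = []

    def add(self, subject, relation, obj, t):
        self.facts.append(Fact(subject, relation, obj, t))

    def latest(self, subject, relation):
        """Return the most recently recorded object for (subject, relation), or None."""
        matches = [f for f in self.facts
                   if f.subject == subject and f.relation == relation]
        return max(matches, key=lambda f: f.t).obj if matches else None


def critic_needs_replan(predicted, observed):
    """Stand-in critic: request replanning when the observation differs from
    the prediction. The paper uses an LLM for this judgment."""
    return predicted != observed


def run_episode(plan, env_step, wm):
    """Execute plan steps of the form (action, subject, relation, predicted_obj).

    Each observation is written into the world model. Returns a pair
    (steps_executed, replan_requested)."""
    for t, (action, subject, relation, predicted) in enumerate(plan):
        observed = env_step(action)
        wm.add(subject, relation, observed, t)
        if critic_needs_replan(predicted, observed):
            return t + 1, True
    return len(plan), False


# Usage: the imagined trajectory expects the substance to stay solid,
# but the (scripted) environment reports that it melts on the second step.
wm = TemporalKnowledgeGraph()
plan = [
    ("heat substance", "substance", "state", "solid"),
    ("heat substance", "substance", "state", "solid"),
    ("read thermometer", "thermometer", "reading", "unknown"),
]
observations = iter(["solid", "liquid", "unused"])
steps, replan = run_episode(plan, lambda action: next(observations), wm)
print(steps, replan, wm.latest("substance", "state"))
```

In this toy run the loop stops after the second step because the observed state contradicts the prediction, and the world model then holds the newer fact. In a fuller system, the planner would be queried again at that point using the updated graph.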

ReasonPlanner: Enhancing Autonomous Planning in Dynamic Environments with Temporal Knowledge Graphs and LLMs

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution

Reasoning with Language Model is Planning with World Model

A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models

Learning adaptive planning representations with natural language guidance

Smart Language Agents in Real-World Planning

Improving Planning with Large Language Models: A Modular Agentic Architecture

Describe, Explain, Plan and Select: Interactive Planning with LLMs Enables Open-World Multi-Task Agents.

Dynamic Planning for LLM-based Graphical User Interface Automation

Spatial Reasoning and Planning for Deep Embodied Agents

AdaPlanner: Adaptive Planning from Feedback with Language Models

Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples

Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents

Anytime Game-Theoretic Planning with Active Reasoning about Humans' Latent States for Human-Centered Robots

Sequential Planning in Large Partially Observable Environments guided by LLMs

Long-Horizon Planning for Multi-Agent Robots in Partially Observable Environments

RePLan: Robotic Replanning with Perception and Language Models

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency

PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Inner Monologue: Embodied Reasoning through Planning with Language Models