Abstract: Recent advancements in Large Language Models (LLMs) have sparked a revolution across many research fields. In robotics, the integration of common-sense knowledge from LLMs into task and motion planning has drastically advanced the field by unlocking unprecedented levels of context awareness. Despite their vast collection of knowledge, large language models may generate infeasible plans due to hallucinations or missing domain information. To address these challenges and improve plan feasibility and computational efficiency, we introduce DELTA, a novel LLM-informed task planning approach. By using scene graphs as environment representations within LLMs, DELTA achieves rapid generation of precise planning problem descriptions. To enhance planning performance, DELTA decomposes long-term task goals with LLMs into an autoregressive sequence of sub-goals, enabling automated task planners to efficiently solve complex problems. In our extensive evaluation, we show that DELTA enables an efficient and fully automatic task planning pipeline, achieving higher planning success rates and significantly shorter planning times compared to the state of the art.

What problem does this paper attempt to address?

The paper attempts to address the challenges faced by mobile robots in long-term task planning within large and complex environments. Specifically, the paper points out:

1. **Limitations of existing methods**: Although large language models (LLMs) have shown great potential in task and motion planning (TAMP) for robots, directly using pre-trained LLMs to generate action plans for robots often results in infeasible plans because LLMs may hallucinate or lack domain-specific information.
2. **Complexity of environment representation**: In large and complex environments, efficient environment representation is key for robots to understand semantic information. However, mapping mid-level perception representations (such as 2D semantic segmentation) to higher-level abstractions (such as environmental topology and semantic relationships between objects) is both costly and complex.
3. **Difficulty of long-term task planning**: Existing TAMP methods mainly focus on short-term tasks, while long-term task planning is difficult to solve due to the increased complexity of the problem.

To address these issues, the paper proposes DELTA (Decomposed Efficient Long-Term Task planning), a new method for efficient long-term task planning using large language models and scene graphs (SGs). The main contributions of DELTA include:

1. **Combining LLMs and SGs**: By inputting scene graphs into LLMs, DELTA can extract actionable and semantic knowledge from LLMs and combine it with environmental topology. This enables DELTA to solve complex planning problems in previously unseen domains.
2. **Task decomposition strategy**: DELTA uses LLMs to decompose long-term task goals into multiple sub-goals and employs an automated task planner to solve these sub-problems in an autoregressive manner. This improves the success rate, plan quality, and planning time.
3. **Efficient task planning process**: DELTA achieves task planning through five steps: domain generation, scene graph pruning, problem generation, goal decomposition, and autoregressive sub-task planning. Each step leverages the capabilities of LLMs, resulting in efficient task planning.

In summary, DELTA aims to improve the ability of robots to perform long-term task planning in large and complex environments by combining large language models and scene graphs, addressing the issues present in existing methods.
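To make the pruning and autoregressive sub-task planning steps more concrete, here is a minimal Python sketch. It is not DELTA's implementation: the toy scene graph, the `prune` helper, and the hard-coded sub-goals are all hypothetical, and a plain breadth-first search stands in for the automated task planner. In DELTA, the sub-goals would come from an LLM and the planning problems would be generated descriptions handed to a real planner. What the sketch does illustrate is the autoregressive idea: the final state of each solved sub-problem becomes the initial state of the next.

```python
from collections import deque

# Toy scene graph (hypothetical): room connectivity plus object locations.
ROOMS = {
    "kitchen": ["hall"],
    "hall": ["kitchen", "office", "bedroom"],
    "office": ["hall"],
    "bedroom": ["hall"],
}
OBJECTS = {"mug": "kitchen", "book": "bedroom", "lamp": "bedroom"}


def prune(objects, relevant):
    """Keep only objects the goal mentions (stand-in for scene graph pruning)."""
    return {name: room for name, room in objects.items() if name in relevant}


def successors(state):
    """Yield (action, next_state); state = (robot_room, held, placed objects)."""
    robot, held, placed = state
    for nxt in ROOMS[robot]:
        yield f"move {robot} -> {nxt}", (nxt, held, placed)
    if held is None:
        for obj, room in placed:
            if room == robot:
                yield f"pick {obj}", (robot, obj, placed - {(obj, room)})
    else:
        yield f"drop {held}", (robot, None, placed | {(held, robot)})


def solve(start, goal_test):
    """Breadth-first search (stand-in planner); returns (plan, final_state) or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if goal_test(state):
            return plan, state
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None


def plan_autoregressively(start, sub_goals):
    """Solve sub-goals in order; each sub-plan's end state seeds the next search."""
    state, full_plan = start, []
    for goal_test in sub_goals:
        result = solve(state, goal_test)
        if result is None:
            return None
        sub_plan, state = result
        full_plan.extend(sub_plan)
    return full_plan, state


def at(obj, room):
    """Goal test: the object has been put down in the given room."""
    return lambda state: (obj, room) in state[2]


# Long-term goal "mug and book in the office", decomposed by hand here.
objects = prune(OBJECTS, {"mug", "book"})
start = ("hall", None, frozenset(objects.items()))
sub_goals = [at("mug", "office"), at("book", "office")]
plan, final_state = plan_autoregressively(start, sub_goals)
print("\n".join(plan))
```

Each call to `solve` searches only as deep as one sub-goal requires, which is the intuition behind the shorter planning times attributed to decomposition. The trade-off is that a sequence of individually optimal sub-plans is not guaranteed to be globally optimal.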

DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models

MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model

FLTRNN: Faithful Long-Horizon Task Planning for Robotics with Large Language Models

SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models

Towards Human Awareness in Robot Task Planning with Large Language Models

Task and Motion Planning with Large Language Models for Object Rearrangement

Fast and Accurate Task Planning using Neuro-Symbolic Language Models and Multi-level Goal Decomposition

Robot Task Planning Based on Large Language Model Representing Knowledge with Directed Graph Structures

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning

Enhancing Robot Task Planning and Execution through Multi-Layer Large Language Models

Improving Planning with Large Language Models: A Modular Agentic Architecture

ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning

Learning to reason over scene graphs: a case study of finetuning GPT-2 into a robot language model for grounded task planning

Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation

LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Long-Horizon Planning for Multi-Agent Robots in Partially Observable Environments

Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents