DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models

Yuchen Liu, Luigi Palmieri, Sebastian Koch, Ilche Georgievski, Marco Aiello
2024-09-13
Abstract: Recent advancements in Large Language Models (LLMs) have sparked a revolution across many research fields. In robotics, the integration of common-sense knowledge from LLMs into task and motion planning has drastically advanced the field by unlocking unprecedented levels of context awareness. Despite their vast collection of knowledge, large language models may generate infeasible plans due to hallucinations or missing domain information. To address these challenges and improve plan feasibility and computational efficiency, we introduce DELTA, a novel LLM-informed task planning approach. By using scene graphs as environment representations within LLMs, DELTA achieves rapid generation of precise planning problem descriptions. To enhance planning performance, DELTA decomposes long-term task goals with LLMs into an autoregressive sequence of sub-goals, enabling automated task planners to efficiently solve complex problems. In our extensive evaluation, we show that DELTA enables an efficient and fully automatic task planning pipeline, achieving higher planning success rates and significantly shorter planning times compared to the state of the art.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
The paper attempts to address the challenges faced by mobile robots in long-term task planning within large and complex environments. Specifically, the paper points out:

1. **Limitations of existing methods**: Although large language models (LLMs) have shown great potential in task and motion planning (TAMP) for robots, directly using pre-trained LLMs to generate action plans for robots often results in infeasible plans because LLMs may hallucinate or lack domain-specific information.
2. **Complexity of environment representation**: In large and complex environments, efficient environment representation is key for robots to understand semantic information. However, mapping mid-level perception representations (such as 2D semantic segmentation) to higher-level abstractions (such as environmental topology and semantic relationships between objects) is both costly and complex.
3. **Difficulty of long-term task planning**: Existing TAMP methods mainly focus on short-term tasks, while long-term task planning is difficult to solve due to the increased complexity of the problem.

To address these issues, the paper proposes DELTA (Decomposed Efficient Long-Term Task planning), a new method for efficient long-term task planning using large language models and scene graphs (SGs). The main contributions of DELTA include:

1. **Combining LLMs and SGs**: By inputting scene graphs into LLMs, DELTA can extract actionable and semantic knowledge from LLMs and combine it with environmental topology. This enables DELTA to solve complex planning problems in previously unseen domains.
2. **Task decomposition strategy**: DELTA uses LLMs to decompose long-term task goals into multiple sub-goals and employs an automated task planner to solve these sub-problems in an autoregressive manner. This improves the success rate, plan quality, and planning time.
3. **Efficient task planning process**: DELTA achieves task planning through five steps: domain generation, scene graph pruning, problem generation, goal decomposition, and autoregressive sub-task planning. Each step leverages the capabilities of LLMs, resulting in efficient task planning.

In summary, DELTA aims to improve the ability of robots to perform long-term task planning in large and complex environments by combining large language models and scene graphs, addressing the issues present in existing methods.
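
To make the last step concrete, here is a minimal Python sketch of autoregressive sub-task planning. It is an illustration under stated assumptions, not the paper's implementation: the breadth-first search stands in for an automated task planner, the sub-goals are hard-coded where DELTA would obtain them from an LLM, and all names (`bfs_plan`, `plan_autoregressively`, the toy rooms and actions) are hypothetical. It also assumes that "autoregressive" means the final state reached for one sub-goal becomes the initial state for the next.

```python
from collections import deque
from typing import FrozenSet, List, Optional, Tuple

State = FrozenSet[str]
# (name, preconditions, add effects, delete effects); a hypothetical STRIPS-like encoding
Action = Tuple[str, FrozenSet[str], FrozenSet[str], FrozenSet[str]]


def bfs_plan(init: State, goal: FrozenSet[str],
             actions: List[Action]) -> Optional[Tuple[List[str], State]]:
    """Breadth-first search used as a stand-in for an automated task planner."""
    queue = deque([(init, [])])
    seen = {init}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps, state
        for name, pre, add, delete in actions:
            if pre <= state:
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None


def plan_autoregressively(init: State, sub_goals: List[FrozenSet[str]],
                          actions: List[Action]) -> Tuple[List[str], State]:
    """Solve sub-goals in order; each solution's end state seeds the next search."""
    state, full_plan = init, []
    for goal in sub_goals:
        result = bfs_plan(state, goal, actions)
        if result is None:
            raise ValueError(f"no plan found for sub-goal {sorted(goal)}")
        steps, state = result
        full_plan.extend(steps)
    return full_plan, state


def act(name: str, pre: List[str], add: List[str], delete: List[str]) -> Action:
    return (name, frozenset(pre), frozenset(add), frozenset(delete))


# Toy domain (assumed for illustration): three rooms in a line, one cup to deliver.
ACTIONS = [
    act("move_hall_kitchen", ["robot_at_hall"], ["robot_at_kitchen"], ["robot_at_hall"]),
    act("move_kitchen_hall", ["robot_at_kitchen"], ["robot_at_hall"], ["robot_at_kitchen"]),
    act("move_hall_office", ["robot_at_hall"], ["robot_at_office"], ["robot_at_hall"]),
    act("move_office_hall", ["robot_at_office"], ["robot_at_hall"], ["robot_at_office"]),
    act("pick_cup", ["robot_at_kitchen", "cup_at_kitchen", "hand_empty"],
        ["holding_cup"], ["cup_at_kitchen", "hand_empty"]),
    act("drop_cup", ["robot_at_office", "holding_cup"],
        ["cup_at_office", "hand_empty"], ["holding_cup"]),
]
INIT = frozenset(["robot_at_hall", "cup_at_kitchen", "hand_empty"])
# Hard-coded here; in DELTA the decomposition is produced by an LLM.
SUB_GOALS = [frozenset(["holding_cup"]), frozenset(["cup_at_office"]),
             frozenset(["robot_at_hall"])]

plan, final_state = plan_autoregressively(INIT, SUB_GOALS, ACTIONS)
print(plan)
```

Each search in this sketch only has to reach the next sub-goal, so it explores a smaller part of the state space than a single search for the full goal would. That is the intuition behind the decomposition; the toy example does not by itself demonstrate the planning-time or success-rate gains reported in the paper.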