LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu
2024-08-21
Abstract: Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning. Crucially, LLM^3 incorporates motion planning feedback through prompting, allowing the LLM to iteratively refine its proposals by reasoning about motion failure. Consequently, LLM^3 interfaces between task planning and motion planning, alleviating the intricate design process of handling domain-specific messages between them. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of LLM^3 in solving TAMP problems and the efficiency in selecting action parameters. Ablation studies underscore the significant contribution of motion failure reasoning to the success of LLM^3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the interface-design problem in Task and Motion Planning (TAMP) for robots: how to connect symbolic task planning with continuous motion planning. Traditional methods implement this interface with manually designed modules, which are domain-specific, hard to adapt to new environments, and labor-intensive to build. The paper therefore proposes the LLM^3 framework, which uses a pre-trained Large Language Model (LLM) in place of these modules to provide more general task planning, action parameter generation, and motion failure reasoning.

Specifically, LLM^3 addresses the TAMP problem in three ways:

1. **Task Planning**: The pre-trained LLM proposes symbolic action sequences.
2. **Action Parameter Generation**: The LLM selects continuous action parameters so that the proposed actions are feasible at the motion level.
3. **Motion Failure Reasoning**: The LLM iteratively reasons about motion planning feedback to refine the proposed action sequences and parameters.

The main advantage of the framework is that it requires neither manually designed symbolic domain files nor domain-specific parameter samplers; it relies instead on the knowledge of the pre-trained LLM to accomplish the tasks above. In addition, LLM^3 revises action sequences and parameters according to the two main failure modes reported in motion planning feedback (collision and inaccessibility), which improves planning efficiency. Simulations in a box-packing domain show quantitatively that LLM^3 solves TAMP problems effectively, and qualitative experiments on a physical manipulator demonstrate its applicability in real-world settings.
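The propose, check, and refine cycle described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Action`, `Feedback`, `plan_with_failure_reasoning`, `toy_proposer`, and `toy_motion_planner` are hypothetical, the "LLM" is replaced by a hand-written rule that reacts to the failure type, and the "motion planner" is a toy reachability and collision check with made-up numbers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    name: str                     # symbolic action, e.g. "pick" or "place"
    target: str                   # object the action applies to
    params: Tuple[float, float]   # continuous parameters (hypothetical x, y position)


@dataclass
class Feedback:
    action_index: int             # index of the action that failed in the plan
    failure: str                  # "collision" or "inaccessibility"
    detail: str                   # text that would be placed in the next prompt


# Assumed interfaces: a proposer (the LLM's role) and a motion-level checker.
Proposer = Callable[[str, List[Feedback]], List[Action]]
MotionChecker = Callable[[List[Action]], Optional[Feedback]]


def plan_with_failure_reasoning(goal: str, propose: Proposer,
                                check_motion: MotionChecker,
                                max_attempts: int = 5):
    """Re-propose a plan until motion checking succeeds or attempts run out."""
    history: List[Feedback] = []
    for _ in range(max_attempts):
        plan = propose(goal, history)      # symbolic actions plus parameters
        feedback = check_motion(plan)      # None means every motion is feasible
        if feedback is None:
            return plan, history
        history.append(feedback)           # fed back into the next proposal
    return None, history


def toy_proposer(goal: str, history: List[Feedback]) -> List[Action]:
    """Stand-in for the LLM: adjusts the placement based on past failures."""
    n_unreachable = sum(fb.failure == "inaccessibility" for fb in history)
    n_collision = sum(fb.failure == "collision" for fb in history)
    x = round(0.9 - 0.2 * n_unreachable, 2)   # move closer if out of reach
    y = round(0.2 * n_collision, 2)           # shift sideways if colliding
    return [Action("pick", "box", (0.3, 0.0)), Action("place", "box", (x, y))]


def toy_motion_planner(plan: List[Action]) -> Optional[Feedback]:
    """Stand-in for a motion planner: reach limit 0.75, obstacle near (0.7, 0.0)."""
    for i, action in enumerate(plan):
        x, y = action.params
        if (x * x + y * y) ** 0.5 > 0.75:
            return Feedback(i, "inaccessibility", f"{action.params} is out of reach")
        if action.name == "place" and abs(x - 0.7) < 0.15 and abs(y) < 0.15:
            return Feedback(i, "collision", f"{action.params} overlaps an obstacle")
    return None


if __name__ == "__main__":
    plan, history = plan_with_failure_reasoning(
        "place the box in the basket", toy_proposer, toy_motion_planner)
    print([fb.failure for fb in history])
    print(plan)
```

In this toy run the first proposal is out of reach and the second collides with the obstacle, so the loop records one failure of each type before a feasible placement is found. In the framework described above, both the proposal and the reaction to feedback come from prompting the LLM rather than from fixed rules.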