Interactive Task Planning with Language Models

Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

2023-10-17

Abstract: An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module design, which makes it hard to generalize to different goals. Recent large language model based approaches allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives and its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system is able to replan accordingly with precision based on the new request, task guidelines and previously executed steps. For more details, see the project site at https://wuphilipp.github.io/itp_site and the video at https://youtu.be/TrKLuyv26_g.

Robotics, Artificial Intelligence, Computation and Language, Human-Computer Interaction

What problem does this paper attempt to address?

This paper addresses how to use large language models (LLMs) for interactive task planning with users. It proposes a framework named ITP (Interactive Task Planning with language models) that targets two limitations of earlier work: traditional methods generalize poorly because they depend on predefined module design, and recent LLM-based methods often require heavy prompt engineering or domain-specific pretrained models. The ITP framework addresses these problems in the following ways:

1. **Combination of high-level planning and low-level execution**: The system not only generates high-level task plans but also calls the robot's low-level skills directly through the language model to execute them.
2. **Adaptation to new tasks and goals**: When a new request or goal arrives during task execution, ITP replans based on the new request, the steps already executed, and the task guidelines.
3. **Reduction of prompt engineering**: ITP does not require complex prompt engineering; providing the task guidelines is enough for it to plan and execute a task.

The paper verifies the effectiveness of ITP through a series of experiments. Across user requests of varying complexity, it shows robustness and generalization in high-level task planning, along with successful execution. For example, when a user asks for a drink that is not listed in the task guidelines, ITP can generate the correct high-level plan and execute it successfully. ITP can also handle the user's real-time feedback during task execution and dynamically adjust the task plan to meet the user's new needs.
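The replanning behavior described above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt layout, the function names (`build_prompt`, `stub_llm`, `parse_plan`, `run`), the `add` skill, and the drink steps are all hypothetical, and `stub_llm` is a canned stand-in for a real language model call. The sketch only shows the control flow the summary names: plan from the task guidelines and the request, execute steps through low-level skills, and replan from the new request plus the steps already executed.

```python
from typing import Callable, Dict, List, Optional


def build_prompt(guidelines: str, request: str, executed: List[str]) -> str:
    """Combine task guidelines, executed steps, and the current request."""
    done = "\n".join(f"- {step}" for step in executed) or "- (none)"
    return (
        f"Task guidelines:\n{guidelines}\n\n"
        f"Steps already executed:\n{done}\n\n"
        f"User request:\n{request}\n\n"
        "List the remaining steps, one per line."
    )


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language model: returns canned steps."""
    before_request, _, request_part = prompt.partition("User request:")
    executed_part = before_request.partition("Steps already executed:")[2]
    steps = ["add tea", "add milk"]
    if "taro" in request_part:
        steps.append("add taro")
    # Only return steps that have not been executed yet.
    return "\n".join(s for s in steps if f"- {s}" not in executed_part)


def parse_plan(text: str) -> List[str]:
    """Split the planner output into one step per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def run(
    guidelines: str,
    request: str,
    llm: Callable[[str], str],
    skills: Dict[str, Callable[[str], None]],
    interruptions: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Execute a plan step by step, replanning when a new request arrives.

    `interruptions` simulates the user: it maps "number of steps executed so
    far" to a new request that arrives at that point (an assumption made only
    to keep this sketch deterministic).
    """
    pending = dict(interruptions or {})
    executed: List[str] = []
    plan = parse_plan(llm(build_prompt(guidelines, request, executed)))
    while plan:
        step = plan.pop(0)
        verb, _, arg = step.partition(" ")
        skills[verb](arg)  # low-level execution via a named skill
        executed.append(step)
        if len(executed) in pending:
            # New request mid-execution: replan from what is already done.
            request = pending.pop(len(executed))
            plan = parse_plan(llm(build_prompt(guidelines, request, executed)))
    return executed


robot_log: List[str] = []
skills = {"add": lambda item: robot_log.append(f"dispensed {item}")}
done = run(
    "Drinks are made by adding ingredients one at a time.",
    "Make a milk tea.",
    stub_llm,
    skills,
    interruptions={1: "Please also add taro."},
)
print(done)  # ['add tea', 'add milk', 'add taro']
```

In this sketch the new request arrives after the first step, so the second planner call sees `add tea` as already executed and returns only the remaining steps. Swapping in a different task would mean changing only the guidelines string and the skill table, which mirrors the adaptation claim in the summary.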

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

Robot Task Planning Based on Large Language Model Representing Knowledge with Directed Graph Structures

CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building

One to rule them all: natural language to bind communication, perception and action

ProgPrompt: Generating Situated Robot Task Plans using Large Language Models

Enhancing Robot Task Planning and Execution through Multi-Layer Large Language Models

Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation

Action Contextualization: Adaptive Task Planning and Action Tuning using Large Language Models

Lifelong Robot Learning with Human Assisted Language Planners

A framework for neurosymbolic robot action planning using large language models

Grounding Language Models in Autonomous Loco-manipulation Tasks

PLATO: Planning with LLMs and Affordances for Tool Manipulation

Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions

Towards Human Awareness in Robot Task Planning with Large Language Models

Interactive and Expressive Code-Augmented Planning with Large Language Models

Creative Robot Tool Use with Large Language Models

Task Planning for Robot Manipulator Using Natural Language Task Input with Large Language Models

Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models