Interactive Task Planning with Language Models

Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik
2023-10-17
Abstract: An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module design, which makes it hard to generalize to different goals. Recent large language model based approaches can allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives and its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system is able to replan accordingly with precision based on the new request, task guidelines and previously executed steps. More details are available at https://wuphilipp.github.io/itp_site and https://youtu.be/TrKLuyv26_g.
Robotics, Artificial Intelligence, Computation and Language, Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses how to use large language models (LLMs) for interactive task planning with users. It proposes a framework named ITP (Interactive Task Planning with language models) that targets two shortcomings of prior work: traditional methods generalize poorly because they rely on predefined module designs, and recent LLM-based methods require heavy prompt engineering or domain-specific pretrained models. ITP addresses these problems in three ways:

1. **Combination of high-level planning and low-level execution**: The system not only generates high-level task plans but also calls the robot's low-level skills directly through the language model to execute them.
2. **Adaptation to new tasks and goals**: When a new request or goal arrives during task execution, ITP replans based on the new request, the steps already executed, and the task guidelines.
3. **Reduction of prompt engineering**: Unlike other methods, ITP needs no complex prompt engineering; it adapts to a different task by substituting the task guidelines.

The paper verifies the effectiveness of ITP through a series of experiments. Across user requests of varying complexity, ITP shows robustness and generalization in both high-level task planning and successful execution. For example, when a user requests a drink that does not appear in the task guidelines, ITP generates a correct high-level plan and executes it successfully. ITP also handles real-time user feedback during execution, dynamically adjusting the task plan to meet the user's new needs.
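The plan, execute, and replan cycle described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the skill names, the `(skill, argument)` plan format, the prompt layout in `build_prompt`, and the `interrupts` mechanism are all assumptions made for illustration, and `stub_planner` is a rule-based stand-in where a real system would call a language model. The stub also ignores the guidelines text, which a real planner would read from the prompt.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str]  # (skill name, argument); a hypothetical plan format
Planner = Callable[[str, str, List[Step]], List[Step]]

# Hypothetical low-level skills. A real system would command a robot here.
SKILLS: Dict[str, Callable[[str], str]] = {
    "grasp": lambda obj: f"grasp {obj}",
    "pour": lambda obj: f"pour {obj}",
    "serve": lambda obj: f"serve {obj}",
}


def build_prompt(guidelines: str, request: str, executed: List[Step]) -> str:
    """Assemble the text a real LLM planner might receive (assumed layout)."""
    done = "; ".join(f"{skill}({arg})" for skill, arg in executed) or "none"
    return (
        f"Guidelines: {guidelines}\n"
        f"Request: {request}\n"
        f"Executed steps: {done}\n"
        "Remaining steps:"
    )


def stub_planner(guidelines: str, request: str, executed: List[Step]) -> List[Step]:
    """Rule-based stand-in for an LLM call; returns the steps still to run."""
    # A real planner would send build_prompt(...) to an LLM and parse the reply.
    drink = request.split()[-1]
    full_plan = [("grasp", "cup"), ("pour", drink), ("serve", "cup")]
    return [step for step in full_plan if step not in executed]


def run_interactive(
    guidelines: str,
    request: str,
    planner: Planner = stub_planner,
    interrupts: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Execute a plan step by step, replanning when a new request arrives.

    `interrupts` maps the number of steps executed so far to a new user
    request, simulating feedback that arrives during execution.
    """
    interrupts = interrupts or {}
    executed: List[Step] = []
    log: List[str] = []
    plan = list(planner(guidelines, request, executed))
    while plan:
        skill, arg = plan.pop(0)
        log.append(SKILLS[skill](arg))  # low-level execution of one step
        executed.append((skill, arg))
        new_request = interrupts.get(len(executed))
        if new_request is not None:
            # Replan from the new request, guidelines, and executed steps.
            request = new_request
            plan = list(planner(guidelines, request, executed))
    return log


if __name__ == "__main__":
    guide = "To make a drink: grasp the cup, pour the drink, serve the cup."
    print(run_interactive(guide, "make tea"))
    # ['grasp cup', 'pour tea', 'serve cup']
    print(run_interactive(guide, "make tea", interrupts={1: "make coffee"}))
    # ['grasp cup', 'pour coffee', 'serve cup']
```

In the second call the user changes the request after the first step. The planner receives the already executed `grasp` step and returns only the remaining steps for the new drink, so the cup is not grasped twice.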