Abstract: Large language models have proven useful for robot task planning and task decomposition. However, using them directly to instruct robots raises several problems: they handle complex tasks poorly, interact ineffectively with the environment, and generate machine control instructions that are often not practically executable. To address these problems, this work proposes a multi-layer large language model that improves a robot's ability to handle complex tasks. The proposed model combines multiple large language models to decompose tasks layer by layer, with the goal of improving the accuracy of task planning. During task decomposition, a visual language model serves as a sensor for environment perception. Its perception results are fed into the large language model, combining the task objectives with environmental information so that the generated robot motion plan is tailored to the current environment. Furthermore, to make the task planning output of the large language model more executable, a semantic alignment method is introduced that aligns task planning descriptions with the functional requirements of robot motion, improving the compatibility and coherence of the generated instructions. To validate the approach, an experimental platform is built on an intelligent unmanned vehicle and used to empirically verify the multi-layer large language model on robot task planning and execution.
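The abstract does not say how the semantic alignment is computed, so the following is only an illustrative sketch of the general idea: mapping a free-text plan step onto the closest function the robot actually supports. It uses plain token overlap (Jaccard similarity) as a stand-in, which is an assumption and not the authors' method. The primitive names, keyword lists, and threshold are all hypothetical.

```python
import re

# Hypothetical motion primitives for an unmanned vehicle (not taken from the paper).
# Each maps to a small bag of keywords describing what the function does.
PRIMITIVES = {
    "move_forward": "move drive go forward ahead straight",
    "turn_left": "turn rotate steer left",
    "turn_right": "turn rotate steer right",
    "stop": "stop halt brake wait",
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def align(step, primitives=PRIMITIVES, threshold=0.1):
    """Return the primitive whose keywords best overlap the plan step.

    Similarity is Jaccard overlap between token sets. Returns None when no
    primitive reaches the (assumed) threshold, i.e. the step is not executable.
    """
    step_tokens = tokens(step)
    best_name, best_score = None, 0.0
    for name, keywords in primitives.items():
        keyword_tokens = tokens(keywords)
        union = step_tokens | keyword_tokens
        score = len(step_tokens & keyword_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Example plan steps, as a language model might phrase them.
plan = [
    "Drive straight ahead to the doorway",
    "Rotate left toward the table",
    "Halt next to the table",
]
aligned = [align(step) for step in plan]
```

A step with no matching primitive, such as `align("Pick up the cup")`, returns `None`, which a planner could treat as a signal to re-plan. A practical system would more likely compare sentence embeddings than keyword sets; token overlap is used here only to keep the sketch dependency-free.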

The Robot’s Understanding of Classification Concepts Based on Large Language Model

Decision-Making in Robotic Grasping with Large Language Models

Enhancing Robot Task Planning and Execution through Multi-Layer Large Language Models

CLFR-M: Continual Learning Framework for Robots Via Human Feedback and Dynamic Memory

Large Language Models for Robotics: Opportunities, Challenges, and Perspectives

MHRC: Closed-loop Decentralized Multi-Heterogeneous Robot Collaboration with Large Language Models

RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models

Large Language Models for Robotics: A Survey

A Smart Interactive Camera Robot Based on Large Language Models

Leveraging Large (Visual) Language Models for Robot 3D Scene Understanding

Applying Large Language Model to a Control System for Multi-Robot Task Assignment

LLM as A Robotic Brain: Unifying Egocentric Memory and Control

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

Action Contextualization: Adaptive Task Planning and Action Tuning using Large Language Models

Leveraging Large Language Models for Robot 3D Scene Understanding

Empowering Robot Path Planning with Large Language Models: osmAG Map Topology & Hierarchy Comprehension with LLMs

LLM Granularity for On-the-Fly Robot Control