Abstract: Robot learning relies heavily on human expertise and effort, such as demonstrations, the design of reward functions in reinforcement learning, and performance evaluation using human feedback. However, this reliance on human assistance makes learning expensive and skill learning difficult to scale. In this work, we introduce the Large Language Model Supervised Robotics Text2Skill Autonomous Learning (ARO) framework, which aims to replace human participation in the robot skill learning process with large language models that handle reward function design and performance evaluation. We provide evidence that our approach enables fully autonomous robot skill learning and can complete some tasks without human intervention. We also analyze the limitations of this approach in task understanding and optimization stability.

What problem does this paper attempt to address?

The paper addresses the heavy dependence on human experts in robot skill learning. In traditional methods, designing robot behaviors, designing reward functions, and evaluating results all require significant human intervention, which is time-consuming and difficult to scale. To solve this problem, the researchers propose the ARO (Large Language Model Supervised Robotics Text2Skill Autonomous Learning) framework.

The core idea of ARO is to use large language models (LLMs) in place of humans in the robot skill learning process. Specifically, the framework automatically generates reward functions and trains reinforcement learning (RL) agents on them to control robots performing tasks. It also evaluates the robot's performance autonomously and iteratively optimizes the reward functions based on the evaluation results, with the goal of a skill learning process that needs no human involvement.

The main contributions of the paper include:

1. **Proposing the ARO framework**: A framework for autonomous robot skill learning supervised by large language models, capable of automatically generating appropriate reward function code from natural language instructions and training robots to perform the corresponding tasks through reinforcement learning.
2. **Automated evaluation and optimization**: By constructing an Evaluation Function Generation module (EFG), a Performance Evaluation module (PE), and an Environment Evaluation module (EE), the framework achieves automated evaluation and iterative optimization of robot performance.
3. **Experimental validation**: A series of experiments, including basic experiments and experiments with randomly generated tasks, validated the effectiveness of the ARO framework and demonstrated its ability to control robots to complete various tasks in different environments.

The paper also discusses limitations of the method: environmental understanding, task comprehension, and optimization stability all need improvement. Future work will focus on improving the system's generalization ability and increasing the complexity of the skills it can learn.
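
The generate, train, evaluate, and refine cycle described in this summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function `autonomous_skill_loop`, the `Evaluation` type, and the three `toy_*` stand-ins are hypothetical names introduced here for illustration. In a real system, the reward proposer and the evaluator would presumably wrap LLM calls, and the trainer would run an RL algorithm in a simulator.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Evaluation:
    """Outcome of one automated evaluation pass (hypothetical type)."""
    success: bool
    feedback: str


def autonomous_skill_loop(
    instruction: str,
    propose_reward: Callable[[str, Optional[str]], str],
    train_policy: Callable[[str], Any],
    evaluate: Callable[[str, Any], Evaluation],
    max_iterations: int = 5,
) -> Tuple[Optional[Any], List[str]]:
    """Iterate reward proposal, training, and evaluation until success.

    Returns the first policy judged successful (or None if the iteration
    budget runs out) together with every reward candidate that was tried.
    """
    feedback: Optional[str] = None
    history: List[str] = []
    for _ in range(max_iterations):
        # Assumed to be an LLM call in practice: instruction + feedback -> reward code.
        reward_code = propose_reward(instruction, feedback)
        history.append(reward_code)
        # Assumed to be an RL training run in practice.
        policy = train_policy(reward_code)
        # Assumed to be an automated (LLM-generated) evaluation in practice.
        result = evaluate(instruction, policy)
        if result.success:
            return policy, history
        feedback = result.feedback  # fed into the next reward proposal
    return None, history


# Deterministic toy stand-ins so the sketch runs without an LLM or simulator.
def toy_propose_reward(instruction: str, feedback: Optional[str]) -> str:
    version = 0 if feedback is None else int(feedback.split("=")[1]) + 1
    return f"reward_v{version}"


def toy_train_policy(reward_code: str) -> dict:
    return {"trained_with": reward_code}


def toy_evaluate(instruction: str, policy: dict) -> Evaluation:
    version = int(policy["trained_with"].rsplit("v", 1)[1])
    # The toy evaluator accepts the third reward candidate (version 2).
    return Evaluation(success=version >= 2, feedback=f"last_version={version}")


if __name__ == "__main__":
    policy, history = autonomous_skill_loop(
        "lift the cube", toy_propose_reward, toy_train_policy, toy_evaluate
    )
    print(policy, history)
```

Passing the three stages in as callables keeps the loop independent of any particular LLM, RL library, or simulator. The `max_iterations` cap is one simple way to bound the cost of a loop that may not converge, which relates to the optimization stability limitation noted above.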

ARO: Large Language Model Supervised Robotics Text2Skill Autonomous Learning

Decision-Making in Robotic Grasping with Large Language Models

Learning Reward for Robot Skills Using Large Language Models via Self-Alignment

Language to Rewards for Robotic Skill Synthesis

LaMI: Large Language Models for Multi-Modal Human-Robot Interaction

Interactive Robot Learning from Verbal Correction

Grounding Language with Visual Affordances over Unstructured Data

Continual Skill and Task Learning via Dialogue

Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models

Enhancing Robot Task Planning and Execution through Multi-Layer Large Language Models

Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models

Grounding Language Models in Autonomous Loco-manipulation Tasks

Understanding Large-Language Model (LLM)-powered Human-Robot Interaction

RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models

Automatic Robotic Development through Collaborative Framework by Large Language Models

A Smart Interactive Camera Robot Based on Large Language Models

Large Language Models as Zero-Shot Human Models for Human-Robot Interaction

Large Language Models for Robotics: A Survey

Leveraging Large Language Models for Comprehensive Locomotion Control in Humanoid Robots Design

Agentic Skill Discovery