Abstract: Continual and interactive robot learning is challenging because the robot works alongside human users who expect it to keep learning novel skills for novel tasks, and to do so sample-efficiently. In this work, we present a framework for robots to query and learn visuo-motor skills and task-relevant information via natural language dialogue with human users. Previous approaches either focus on improving the performance of instruction-following agents or passively learn novel skills or concepts. Instead, we use dialogue combined with a language-skill grounding embedding to query or confirm skills and/or tasks requested by a user. To achieve this goal, we developed and integrated three components for our agent. First, we propose a novel visuo-motor control policy, ACT with Low Rank Adaptation (ACT-LoRA), which enables the existing SoTA ACT model to perform few-shot continual learning. Second, we develop an alignment model that projects demonstrations across skill embodiments into a shared embedding, allowing us to know when to ask users questions and/or request demonstrations. Finally, we integrate an existing LLM to interact with a human user and perform grounded, interactive continual skill learning to solve a task. Our ACT-LoRA model learns novel fine-tuned skills with 100% accuracy when trained with only five demonstrations for a novel skill, while still maintaining 74.75% accuracy on pre-trained skills in the RLBench dataset, where other models fall significantly short. We also performed a human-subjects study with 8 subjects to demonstrate the continual learning capabilities of our combined framework. We achieve a success rate of 75% on a sandwich-making task with a real robot learning from participant data, demonstrating that robots can learn novel skills or task knowledge from dialogue with non-expert users using our approach.

Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models

Grounding Language for Robotic Manipulation via Skill Library

Lifelong Robot Learning with Human Assisted Language Planners

LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning

CLFR-M: Continual Learning Framework for Robots Via Human Feedback and Dynamic Memory

Agentic Skill Discovery

Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks

Grounding Language Models in Autonomous Loco-manipulation Tasks

Grounding Language with Visual Affordances over Unstructured Data

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control

Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment

Large Language Models for Orchestrating Bimanual Robots

Interactive Robot Learning from Verbal Correction

LLM as A Robotic Brain: Unifying Egocentric Memory and Control

Game On: Towards Language Models as RL Experimenters

Continual Skill and Task Learning via Dialogue

Leveraging Large Language Models for Comprehensive Locomotion Control in Humanoid Robots Design

CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models

Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models

Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model