Abstract: Continual and interactive robot learning is challenging because the robot works alongside human users who expect it to keep learning novel skills for novel tasks, and to do so sample-efficiently. In this work, we present a framework for robots to query and learn visuo-motor skills and task-relevant information via natural language dialogue with human users. Previous approaches either focus on improving the performance of instruction-following agents or passively learn novel skills or concepts. Instead, we use dialogue combined with a language-skill grounding embedding to query or confirm skills and/or tasks requested by a user. To achieve this goal, we develop and integrate three components for our agent. First, we propose a novel visuo-motor control policy, ACT with Low Rank Adaptation (ACT-LoRA), which enables the existing SoTA ACT model to perform few-shot continual learning. Second, we develop an alignment model that projects demonstrations across skill embodiments into a shared embedding, allowing us to know when to ask users questions and/or request demonstrations. Finally, we integrate an existing LLM to interact with a human user and perform grounded interactive continual skill learning to solve a task. Our ACT-LoRA model learns novel fine-tuned skills with 100% accuracy when trained with only five demonstrations for a novel skill, while still maintaining 74.75% accuracy on pre-trained skills in the RLBench dataset, where other models fall significantly short. We also performed a human-subjects study with 8 subjects to demonstrate the continual learning capabilities of our combined framework. We achieve a success rate of 75% on the task of sandwich making with the real robot learning from participant data, demonstrating that robots can learn novel skills or task knowledge from dialogue with non-expert users using our approach.
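The abstract names Low Rank Adaptation as the mechanism that lets a pre-trained policy pick up new skills from a few demonstrations without overwriting old ones. The following is a minimal sketch of the general low-rank adaptation idea on a single linear layer, not the ACT-LoRA implementation: the class `LoRALinear`, its parameters `rank` and `alpha`, and the layer sizes are all hypothetical names and values chosen for illustration. The pre-trained weight stays frozen, and only two small matrices `A` and `B` would be trained per new skill.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a low-rank update (illustrative sketch only)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # pre-trained weights, never modified here
        # A gets a small random init; B starts at zero so the adapter
        # initially leaves the pre-trained behaviour unchanged.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))
        self.scale = alpha / rank

    def delta(self):
        # Low-rank correction added to the frozen weight.
        return self.scale * (self.B @ self.A)

    def forward(self, x):
        # x has shape (batch, in_dim); output has shape (batch, out_dim).
        return x @ (self.weight + self.delta()).T

    def num_adapter_params(self):
        return self.A.size + self.B.size


# Hypothetical sizes, chosen only for the example.
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(5, 32))
base_out = x @ W.T
adapted_out = layer.forward(x)
```

Because `B` starts at zero, `adapted_out` matches `base_out` before any fine-tuning, and the adapter holds far fewer parameters than `W`. Under this assumption, keeping one adapter per skill is one plausible way to add skills while leaving the shared pre-trained weights intact.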

Learning to Mediate Perceptual Differences in Situated Human-Robot Dialogue

Towards Mediating Shared Perceptual Basis in Situated Dialogue

Collaborative Effort Towards Common Ground in Situated Human-Robot Dialogue

Dialogue Learning with Human-in-the-Loop

Learning through Dialogue Interactions by Asking Questions

Human-robot Negotiation of Intentions Based on Virtual Fixtures for Shared Task Execution

Learning to Mediate Disparities Towards Pragmatic Communication

Multi-robot behavior adaptation to local and global communication atmosphere in humans-robots interaction

Human-Robot Dialogue Annotation for Multi-Modal Common Ground

Ambiguities in Spatial Language Understanding in Situated Human Robot Dialogue

Collaborative Language Grounding Toward Situated Human‐Robot Dialogue

Task Learning Through Visual Demonstration and Situated Dialogue

Continual Skill and Task Learning via Dialogue

Aligning Learning with Communication in Shared Autonomy

Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction

Exploring the Design of Robot Mediation with Bodily Contact for Remote Conflict

Resolving Positional Ambiguity in Dialogues by Vision-Language Models for Robot Navigation

Learning to Communicate Functional States with Nonverbal Expressions for Improved Human-Robot Collaboration

An emotion-driven and topic-aware dialogue framework for human–robot interaction

SYNERGAI: Perception Alignment for Human-Robot Collaboration

Towards Visual Dialogue for Human-Robot Interaction