Abstract: Dexterous manipulation, often facilitated by multi-fingered robotic hands, has significant potential for real-world applications. Soft robotic hands, owing to their compliant nature, offer flexibility and adaptability during object grasping and manipulation. These benefits come with challenges, however, particularly in developing controllers for finger coordination. Reinforcement Learning (RL) can be employed to train object-specific in-hand manipulation policies, but such policies have limited adaptability and generalizability. We introduce a Continual Policy Distillation (CPD) framework to acquire a versatile controller for in-hand manipulation that rotates objects of different shapes and sizes within a four-fingered soft gripper. The framework leverages Policy Distillation (PD) to transfer knowledge from expert policies to a continually evolving student policy network. Exemplar-based rehearsal methods are then integrated to mitigate catastrophic forgetting and enhance generalization. The performance of the CPD framework across various replay strategies demonstrates its effectiveness in consolidating knowledge from multiple experts and achieving versatile and adaptive behaviours for in-hand manipulation tasks.

What problem does this paper attempt to address?

The paper addresses the problem of achieving flexible and versatile in-hand object manipulation with soft robotic hands. Reinforcement Learning (RL) can train in-hand manipulation policies, but each resulting policy is specific to one object, so it adapts and generalizes poorly to objects of other shapes and sizes. To tackle this challenge, the authors propose a Continual Policy Distillation (CPD) framework, which extracts knowledge from multiple expert policies and integrates it into a continually evolving student policy network, yielding a single general and flexible controller. The CPD framework also incorporates exemplar-based rehearsal (replay) to mitigate catastrophic forgetting and improve the model's generalization ability.

The main contributions of the paper include:

1. **Proposing the CPD framework**: The framework uses policy distillation to transfer knowledge from multiple expert policies to a student policy network, enabling the student to keep learning and improving without requiring access to the pre-training data.
2. **Mitigating catastrophic forgetting**: By integrating exemplar-based rehearsal, the CPD framework retains previously learned knowledge while learning new tasks.
3. **Experimental validation**: A series of experiments on a four-fingered soft gripper validates the effectiveness and robustness of the CPD framework on rotation tasks with objects of different shapes and sizes.

Overall, the paper aims to overcome the limitations of object-specific reinforcement learning policies in soft robotic in-hand manipulation through the CPD framework, achieving more general and flexible control strategies.
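To make the interaction between distillation and rehearsal concrete, the following is a minimal toy sketch, not the paper's implementation. Everything in it is an assumption chosen for illustration: the experts are stand-in linear policies rather than trained RL agents, the object is described by a single hypothetical size value, the student is a linear map, and names such as `train_continually` and `exemplars_per_task` are made up here. It distills two experts into one student in sequence, once without and once with a small exemplar buffer, so the effect of rehearsal on the first task can be compared.

```python
import numpy as np

STATE_DIM, ACT_DIM = 4, 3      # hypothetical sizes, not taken from the paper
OBJECT_SIZES = [0.5, 1.5]      # one hypothetical object (and expert) per entry

_rng = np.random.default_rng(0)
A = _rng.normal(size=(STATE_DIM, ACT_DIM))
B = _rng.normal(size=(STATE_DIM, ACT_DIM))


def expert_action(state, size):
    """Stand-in for an object-specific RL expert (toy linear policy)."""
    return state @ (A + size * B)


def features(state, size):
    """Student input: the state plus a size-scaled copy acting as object descriptor."""
    return np.concatenate([state, size * state], axis=1)


def sample_task(rng, size, n):
    """Collect n (student input, expert action) pairs for one object."""
    state = rng.normal(size=(n, STATE_DIM))
    return features(state, size), expert_action(state, size)


def distill_step(W, x, y, lr=0.05):
    """One gradient step on the distillation loss (MSE between student and expert actions)."""
    grad = 2.0 * x.T @ (x @ W - y) / len(x)
    return W - lr * grad


def train_continually(use_replay, steps=2000, batch=32, exemplars_per_task=32, seed=1):
    rng = np.random.default_rng(seed)
    W = np.zeros((2 * STATE_DIM, ACT_DIM))   # linear student policy
    buffer_x, buffer_y = [], []              # exemplar memory of earlier experts
    for size in OBJECT_SIZES:                # experts arrive one at a time
        for _ in range(steps):
            x, y = sample_task(rng, size, batch)
            if use_replay and buffer_x:
                # Rehearsal: mix stored exemplars from earlier experts into the batch.
                x = np.vstack([x] + buffer_x)
                y = np.vstack([y] + buffer_y)
            W = distill_step(W, x, y)
        # Store a few exemplars of this task for future rehearsal.
        ex_x, ex_y = sample_task(rng, size, exemplars_per_task)
        buffer_x.append(ex_x)
        buffer_y.append(ex_y)
    return W


def evaluate(W, size, n=1000, seed=2):
    """Mean squared gap between student and expert actions on fresh states."""
    x, y = sample_task(np.random.default_rng(seed), size, n)
    return float(np.mean((x @ W - y) ** 2))


if __name__ == "__main__":
    for use_replay in (False, True):
        W = train_continually(use_replay)
        errors = [round(evaluate(W, s), 4) for s in OBJECT_SIZES]
        print("with replay" if use_replay else "no replay", errors)
```

In this toy setting the student trained without the buffer fits the second expert but drifts away from the first, while the student that rehearses stored exemplars can match both. The paper compares several replay strategies; this sketch uses only the simplest one (a fixed number of randomly drawn exemplars per task).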

Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation

Ensemble Bootstrapped Deep Deterministic Policy Gradient For Vision-Based Robotic Grasping

Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

Dexterous In-Hand Manipulation of Slender Cylindrical Objects through Deep Reinforcement Learning with Tactile Sensing

Learning Hierarchical Control for Robust In-Hand Manipulation

Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning

On Policy Learning Robust to Irreversible Events: An Application to Robotic In-Hand Manipulation

Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints

Hierarchical Tactile-Based Control Decomposition of Dexterous In-Hand Manipulation Tasks

Open-Loop Motion Control of a Hydraulic Soft Robotic Arm Using Deep Reinforcement Learning

Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning

Experience Consistency Distillation Continual Reinforcement Learning for Robotic Manipulation Tasks

Learning Playing Piano with Bionic-Constrained Diffusion Policy for Anthropomorphic Hand

Dextrous Tactile In-Hand Manipulation Using a Modular Reinforcement Learning Architecture

Cross-Embodiment Dexterous Grasping with Reinforcement Learning

Learning dexterous in-hand manipulation

Learning Deep Visuomotor Policies for Dexterous Hand Manipulation

Learning thin deformable object manipulation with a multi-sensory integrated soft hand

Object Manipulation with an Anthropomorphic Robotic Hand via Deep Reinforcement Learning with a Synergy Space of Natural Hand Poses

Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

Estimator-Coupled Reinforcement Learning for Robust Purely Tactile In-Hand Manipulation