Abstract: Dexterous manipulation is a critical aspect of human capability, enabling interaction with a wide variety of objects. Recent advancements in learning from human demonstrations and teleoperation have enabled progress for robots in such ability. However, these approaches either require complex data collection such as costly human effort for eye-robot contact, or suffer from poor generalization when faced with novel scenarios. To solve both challenges, we propose a framework, DexH2R, that combines human hand motion retargeting with a task-oriented residual action policy, improving task performance by bridging the embodiment gap between human and robotic dexterous hands. Specifically, DexH2R learns the residual policy directly from retargeted primitive actions and task-oriented rewards, eliminating the need for labor-intensive teleoperation systems. Moreover, we incorporate test-time guidance for novel scenarios by taking in desired trajectories of human hands and objects, allowing the dexterous hand to acquire new skills with high generalizability. Extensive experiments in both simulation and real-world environments demonstrate the effectiveness of our work, outperforming prior state-of-the-art methods by 40% across various settings.

What problem does this paper attempt to address?

The paper addresses how to learn and transfer skills from human demonstrations for dexterous robot manipulation, while overcoming two major limitations of existing methods:

1. **Complex data collection**: Existing methods, such as training robots through human demonstrations or teleoperation, usually require a complex and labor-intensive data collection process. For example, a large amount of human participation is required to achieve eye-hand coordination.
2. **Poor generalization ability**: When facing new scenarios, these methods often perform poorly because they rely on specific human prior knowledge and adapt poorly to new environments or objects.

To address these problems, the paper proposes a framework, DexH2R (Dexterous Manipulation from Human to Robots), which combines human hand motion retargeting with a task-oriented residual action policy, aiming to improve task performance by bridging the embodiment gap between human and robotic dexterous hands. Specifically, DexH2R learns the residual policy directly from retargeted primitive actions and task-oriented rewards, eliminating the need for labor-intensive teleoperation systems. In addition, by taking in desired human hand and object trajectories, DexH2R can provide guidance for new scenarios at test time, enabling the dexterous hand to acquire new skills with high generalization ability.

The main contributions of the paper are:

- **Improved generalization ability**: By taking in human hand trajectories, DexH2R can avoid collisions and improve the task success rate when facing new scenarios.
- **Reduced manpower requirements**: Unlike traditional teleoperation systems, DexH2R requires neither real-time human intervention nor expensive system support, which significantly reduces manpower costs while keeping operation smooth.
- **Experimental verification**: Extensive experiments in simulation and real-world environments show that DexH2R outperforms existing methods in multiple settings, with a success rate increased by approximately 40%.

Overall, DexH2R provides a comprehensive solution for dexterous robot manipulation. It can not only follow human hand motions to complete tasks, but also generalize to new environments by using human motion cues during inference.
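
The idea of a residual policy on top of retargeted actions can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the common additive form, in which a learned correction is scaled and added to the retargeted joint targets and the result is clipped to joint limits. The names `compose_action` and `zero_residual_policy`, the `residual_scale` value, and the joint limits are all hypothetical.

```python
from typing import List, Sequence


def clip(value: float, low: float, high: float) -> float:
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, value))


def compose_action(
    retargeted: Sequence[float],
    residual: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    residual_scale: float = 0.1,  # assumed value, not from the paper
) -> List[float]:
    """Add a scaled residual correction to retargeted joint targets.

    Assumption: the residual is normalized to [-1, 1] per joint, so it is
    clipped to that range before scaling. The final command is clipped to
    the (hypothetical) joint limits of the robot hand.
    """
    if not (len(retargeted) == len(residual) == len(lower) == len(upper)):
        raise ValueError("all inputs must have one entry per joint")
    return [
        clip(base + residual_scale * clip(res, -1.0, 1.0), lo, hi)
        for base, res, lo, hi in zip(retargeted, residual, lower, upper)
    ]


def zero_residual_policy(observation: Sequence[float]) -> List[float]:
    """Placeholder for a learned policy: returns no correction at all."""
    return [0.0] * len(observation)


# Toy example with three joints (values are illustrative only).
retargeted_targets = [0.2, 0.5, 1.0]
joint_lower = [0.0, 0.0, 0.0]
joint_upper = [1.0, 1.0, 1.0]

# With a zero residual, the command is just the retargeted action.
baseline = compose_action(
    retargeted_targets,
    zero_residual_policy(retargeted_targets),
    joint_lower,
    joint_upper,
)

# With a non-zero residual, each joint is nudged by at most residual_scale.
corrected = compose_action(
    retargeted_targets, [1.0, -0.5, 2.0], joint_lower, joint_upper
)
```

In this form the retargeted human motion supplies the bulk of the command, and the learned term only needs to make small corrections, which is one plausible reading of how a residual policy could compensate for differences between human and robot hands.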

DexH2R: Task-oriented Dexterous Manipulation from Human to Robots

DexRepNet: Learning Dexterous Robotic Grasping Network with Geometric and Spatial Hand-Object Representations

DexDeform: Dexterous Deformable Object Manipulation with Human Demonstrations and Differentiable Physics

Object-Centric Dexterous Manipulation from Human Motion Data

DexSkills: Skill Segmentation Using Haptic Data for Learning Autonomous Long-Horizon Robotic Manipulation Tasks

DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Real-time Dexterous Telemanipulation with an End-Effect-Oriented Learning-based Approach

H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation

DeltaHands: A Synergistic Dexterous Hand Framework Based on Delta Robots

DexDiff: Towards Extrinsic Dexterity Manipulation of Ungraspable Objects in Unrestricted Environments

Learning Human-to-Robot Dexterous Handovers for Anthropomorphic Hand

DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation

Holo-Dex: Teaching Dexterity with Immersive Mixed Reality

DexPilot: Vision-Based Teleoperation of Dexterous Robotic Hand-Arm System

Tilde: Teleoperation for Dexterous In-Hand Manipulation Learning with a DeltaHand

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

DexSim2Real$^{2}$: Building Explicit World Model for Precise Articulated Object Dexterous Manipulation

DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance