Abstract: Human hands possess the dexterity to interact with diverse objects such as grasping specific parts of the objects and/or approaching them from desired directions. More importantly, humans can grasp objects of any shape without object-specific skills. Recent works synthesize grasping motions following single objectives such as a desired approach heading direction or a grasping area. Moreover, they usually rely on expensive 3D hand-object data during training and inference, which limits their capability to synthesize grasping motions for unseen objects at scale. In this paper, we unify the generation of hand-object grasping motions across multiple motion objectives, diverse object shapes and dexterous hand morphologies in a policy learning framework GraspXL. The objectives are composed of the graspable area, heading direction during approach, wrist rotation, and hand position. Without requiring any 3D hand-object interaction data, our policy trained with 58 objects can robustly synthesize diverse grasping motions for more than 500k unseen objects with a success rate of 82.2%. At the same time, the policy adheres to objectives, which enables the generation of diverse grasps per object. Moreover, we show that our framework can be deployed to different dexterous hands and work with reconstructed or generated objects. We quantitatively and qualitatively evaluate our method to show the efficacy of our approach. Our model, code, and the large-scale generated motions are available at https://eth-ait.github.io/graspxl/.
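The four objectives named in the abstract (graspable area, heading direction, wrist rotation, hand position) can be pictured as a small record plus per-objective error measures. The sketch below is only an illustration under stated assumptions: the `GraspObjective` class, its field names, the vertex-index encoding of the graspable area, and the error functions are all hypothetical and are not taken from the GraspXL code.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Vec3 = Tuple[float, float, float]


def _normalize(v: Vec3) -> Vec3:
    """Return v scaled to unit length; a zero vector has no direction."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector has no direction")
    return (v[0] / n, v[1] / n, v[2] / n)


def angle_between(a: Vec3, b: Vec3) -> float:
    """Angle in radians between two non-zero vectors."""
    ua, ub = _normalize(a), _normalize(b)
    dot = sum(x * y for x, y in zip(ua, ub))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))


def wrapped_angle_error(target: float, achieved: float) -> float:
    """Absolute difference between two angles, wrapped into [0, pi]."""
    d = (target - achieved + math.pi) % (2.0 * math.pi) - math.pi
    return abs(d)


@dataclass(frozen=True)
class GraspObjective:
    """Hypothetical container for the four objectives in the abstract."""
    graspable_vertices: FrozenSet[int]  # assumed encoding: object vertex indices
    heading: Vec3                       # desired approach direction
    wrist_rotation: float               # desired wrist rotation, in radians
    hand_position: Vec3                 # desired hand position


def heading_error(obj: GraspObjective, achieved: Vec3) -> float:
    """Angular deviation (radians) from the desired approach direction."""
    return angle_between(obj.heading, achieved)


def position_error(obj: GraspObjective, achieved: Vec3) -> float:
    """Euclidean distance from the desired hand position."""
    return math.dist(obj.hand_position, achieved)
```

A grasp approaching along the y axis when the x axis was requested would, under this sketch, have a heading error of a quarter turn, which makes "adheres to objectives" something that can be measured per objective.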

What problem does this paper attempt to address?

### Main Problem Addressed by the Paper

The primary goal of this paper is to develop a method that generates hand grasping motions at scale and adapts to diverse object shapes, hand morphologies, and multiple grasping objectives. Specifically, the researchers propose the GraspXL framework, which generates grasping motions for a large number of unseen objects (more than 500,000) without relying on 3D hand-object interaction data.

### Key Points of the Solution

1. **Multi-Objective Grasp Synthesis**: GraspXL handles multiple grasping objectives jointly: the graspable area, the heading direction during approach, the wrist rotation, and the hand position.
2. **Wide Applicability**: The method can be deployed to different dexterous hands and works with reconstructed or generated objects.
3. **No Need for 3D Hand-Object Data**: GraspXL requires no 3D hand-object interaction data, which removes a dependency that limits existing methods when scaling to unseen objects.
4. **Efficient Training Strategy**: A learning curriculum divides training into a goal learning stage and a grasp learning stage, easing the difficulty of satisfying multiple objectives simultaneously.
5. **Control-Oriented Guidance Mechanism**: To improve exploration efficiency and control precision, GraspXL uses a simple yet effective mechanism that directly guides the hand towards the target.

### Main Contributions

1. Proposed the GraspXL framework, which generates grasping motions for more than 500,000 unseen objects without hand-object interaction data.
2. Designed a curriculum and a control-oriented guidance mechanism that let the method achieve stable grasps while satisfying multiple objectives.
3. Created a dataset of grasping motions for more than 500,000 different objects interacting with various hand models.
4. Demonstrated that the method applies to reconstructed or generated objects and to different dexterous hand models, such as the Shadow Hand, Allegro Hand, and Faive Hand.

In summary, GraspXL targets two limitations of existing grasp synthesis methods: limited generalization to unseen objects and support for only a single grasping objective.
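The two-stage curriculum described above (goal learning first, then grasp learning) can be pictured as a schedule over reward weights. This is a minimal sketch under assumptions, not the authors' implementation: the summary does not say how the stages are realized, so the hard switch at a fixed step, the names `RewardWeights`, `curriculum_weights`, and `total_reward`, and the weight values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights on the two reward groups."""
    goal: float   # weight on objective-following terms
    grasp: float  # weight on grasp-stability terms


def curriculum_weights(step: int, goal_phase_steps: int) -> RewardWeights:
    """Return the reward weights for a training step.

    Assumption: during the first `goal_phase_steps` steps only the goal
    terms are rewarded; afterwards the grasp terms are switched on too.
    """
    if step < 0 or goal_phase_steps < 0:
        raise ValueError("step counts must be non-negative")
    if step < goal_phase_steps:
        return RewardWeights(goal=1.0, grasp=0.0)
    return RewardWeights(goal=1.0, grasp=1.0)


def total_reward(goal_reward: float, grasp_reward: float,
                 weights: RewardWeights) -> float:
    """Weighted sum of the two reward groups."""
    return weights.goal * goal_reward + weights.grasp * grasp_reward
```

With a goal phase of 100 steps, `total_reward(0.5, 2.0, curriculum_weights(10, 100))` counts only the goal term, while the same call at step 100 or later adds the grasp term as well. A gradual ramp instead of a hard switch would be an equally plausible reading of the summary.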

GraspXL: Generating Grasping Motions for Diverse Objects at Scale

D-Grasp: Physically Plausible Dynamic Grasp Synthesis for Hand-Object Interactions

SynH2R: Synthesizing Hand-Object Motions for Learning Human-To-Robot Handovers

Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation

FunGrasp: Functional Grasping for Diverse Dexterous Hands

GraspGF: Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping

3D Whole-body Grasp Synthesis with Directional Controllability

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

Plausible and Diverse Human Hand Grasping Motion Generation

Generalized Anthropomorphic Functional Grasping with Minimal Demonstrations

PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

Contact2Grasp: 3D Grasp Synthesis via Hand-Object Contact Constraint

DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation

EfficientGrasp: A Unified Data-Efficient Learning to Grasp Method for Multi-Fingered Robot Hands

DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes

Toward Human-Like Grasp: Dexterous Grasping via Semantic Representation of Object-Hand

FastGrasp: Efficient Grasp Synthesis with Diffusion

D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping

Task-Oriented Dexterous Hand Pose Synthesis Using Differentiable Grasp Wrench Boundary Estimator

GenDexGrasp: Generalizable Dexterous Grasping

DexGrasp-Diffusion: Diffusion-based Unified Functional Grasp Synthesis Pipeline for Multi-Dexterous Robotic Hands