GraspXL: Generating Grasping Motions for Diverse Objects at Scale

Hui Zhang, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song
2024-07-13
Abstract: Human hands possess the dexterity to interact with diverse objects such as grasping specific parts of the objects and/or approaching them from desired directions. More importantly, humans can grasp objects of any shape without object-specific skills. Recent works synthesize grasping motions following single objectives such as a desired approach heading direction or a grasping area. Moreover, they usually rely on expensive 3D hand-object data during training and inference, which limits their capability to synthesize grasping motions for unseen objects at scale. In this paper, we unify the generation of hand-object grasping motions across multiple motion objectives, diverse object shapes and dexterous hand morphologies in a policy learning framework GraspXL. The objectives are composed of the graspable area, heading direction during approach, wrist rotation, and hand position. Without requiring any 3D hand-object interaction data, our policy trained with 58 objects can robustly synthesize diverse grasping motions for more than 500k unseen objects with a success rate of 82.2%. At the same time, the policy adheres to objectives, which enables the generation of diverse grasps per object. Moreover, we show that our framework can be deployed to different dexterous hands and work with reconstructed or generated objects. We quantitatively and qualitatively evaluate our method to show the efficacy of our approach. Our model, code, and the large-scale generated motions are available at https://eth-ait.github.io/graspxl/.
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Main Problem Addressed by the Paper

The paper aims to generate hand grasping motions at scale while adapting to varied object shapes, hand morphologies, and multiple grasping objectives. To this end, the authors propose the GraspXL framework, which generates grasping motions for more than 500,000 unseen objects without relying on 3D hand-object interaction data.

### Key Points of the Solution

1. **Multi-Objective Grasp Synthesis**: GraspXL handles several grasping objectives together: the graspable area, the heading direction during approach, the wrist rotation, and the hand position.
2. **Wide Applicability**: The method can be deployed to different dexterous hand models and works with reconstructed or generated objects.
3. **No Need for 3D Hand-Object Data**: GraspXL requires no 3D hand-object interaction data for training, which removes a key obstacle to scaling to unseen objects.
4. **Efficient Training Strategy**: A curriculum divides learning into a goal learning stage and a grasp learning stage, which eases the difficulty of satisfying multiple objectives at once.
5. **Control-Oriented Guidance Mechanism**: To improve exploration efficiency and control precision, GraspXL uses a simple guidance mechanism that directly steers the hand towards the target.

### Main Contributions

1. Proposed the GraspXL framework, which generates grasping motions for over 500,000 unseen objects without hand-object interaction data.
2. Designed a curriculum and a control-oriented guidance mechanism that let the method achieve stable grasps while satisfying multiple objectives.
3. Created a dataset of grasping motions for over 500,000 different objects interacting with various hand models.
4. Demonstrated that the method applies to reconstructed or generated objects and to different dexterous hand models, such as the Shadow Hand, Allegro Hand, and Faive Hand.

In summary, GraspXL targets two limitations of existing grasp synthesis methods: weak generalization to unseen objects and support for only a single grasping objective. It addresses them with a policy trained without hand-object interaction data, a two-stage curriculum, and control-oriented guidance.
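
The multi-objective formulation and the two-stage curriculum described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: this summary does not give the actual reward terms, weights, or stage-switching criterion, so every name here (`GraspObjective`, `goal_error`, `curriculum_weights`, `reward`, `switch_step`) is a hypothetical stand-in, and the graspable-area objective is omitted for brevity.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GraspObjective:
    """Hypothetical container for three of the four objectives named in the abstract."""
    heading: Vec3          # desired heading direction during approach
    wrist_rotation: float  # desired wrist rotation, in radians
    hand_position: Vec3    # desired hand position


def angle_between(u: Vec3, v: Vec3) -> float:
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def goal_error(obj: GraspObjective, heading: Vec3,
               wrist_rotation: float, hand_position: Vec3) -> float:
    """Assumed scalar error: sum of heading, wrist, and position deviations."""
    return (angle_between(obj.heading, heading)
            + abs(obj.wrist_rotation - wrist_rotation)
            + math.dist(obj.hand_position, hand_position))


def curriculum_weights(step: int, switch_step: int) -> Dict[str, float]:
    """Stage 1 (goal learning) rewards only objective tracking;
    stage 2 (grasp learning) additionally rewards grasp quality.
    The hard switch at `switch_step` is an assumption of this sketch."""
    if step < switch_step:
        return {"goal": 1.0, "grasp": 0.0}
    return {"goal": 1.0, "grasp": 1.0}


def reward(step: int, switch_step: int, error: float, grasp_score: float) -> float:
    """Weighted reward: penalize objective error, reward grasp quality."""
    w = curriculum_weights(step, switch_step)
    return -w["goal"] * error + w["grasp"] * grasp_score
```

In this sketch, the policy first learns to reach the commanded heading, wrist rotation, and position, and only after `switch_step` is it also rewarded for grasp quality, which mirrors the goal-learning-then-grasp-learning split described in the summary.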