Dexterous Functional Grasping

Ananye Agarwal, Shagun Uppal, Kenneth Shaw, Deepak Pathak
2023-12-06
Abstract: While there have been significant strides in dexterous manipulation, most of it is limited to benchmark tasks like in-hand reorientation, which are of limited utility in the real world. The main benefit of dexterous hands over two-fingered ones is their ability to pick up tools and other objects (including thin ones) and grasp them firmly to apply force. However, this task requires both a complex understanding of functional affordances and precise low-level control. While prior work obtains affordances from human data, this approach doesn't scale to low-level control. Similarly, simulation training cannot give the robot an understanding of real-world semantics. In this paper, we aim to combine the best of both worlds to accomplish functional grasping for in-the-wild objects. We use a modular approach. First, affordances are obtained by matching corresponding regions of different objects, and then a low-level policy trained in sim is run to grasp the object. We propose a novel application of eigengrasps to reduce the search space of RL using a small amount of human data and find that it leads to more stable and physically realistic motion. We find that the eigengrasp action space beats baselines in simulation, outperforms hardcoded grasping in the real world, and matches or outperforms a trained human teleoperator. Result visualizations and videos are at https://dexfunc.github.io/
Robotics, Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning, Systems and Control
What problem does this paper attempt to address?
This paper addresses functional grasping of complex everyday objects with a low-cost dexterous hand (such as the LEAP hand). Most existing robot learning research relies on two-finger grippers or suction cups, which struggle with tools and other objects that require fine manipulation. Functional grasping requires the robot not only to recognize and locate an object, but also to understand which part of the object is meant to be held and to grasp it stably enough to carry out a subsequent task such as hammering or drilling.

The main contribution is a modular approach that combines the advantages of internet data and large-scale simulation training. The method is divided into three stages:

1. **Pre-grasp stage**: A one-shot affordance model predicts the functional grasp points of an object. The model uses DINOv2 feature matching to find corresponding regions between different objects and thereby infers the correct grasping position.
2. **Grasp stage**: A policy trained in simulation executes the grasp. To cope with the high-dimensional action space, the paper introduces an eigengrasp action space that reduces the action space from 16 to 9 dimensions, making training more stable and the resulting motion more physically realistic.
3. **Post-grasp stage**: Once the object is stably grasped, a 6-DOF robotic arm can move it to any position in space to complete a specific task.

With this method, the paper demonstrates functional grasping of a variety of complex real-world objects, including hammers, electric drills, frying pans, staplers, and screwdrivers, even when these objects did not appear during training. This marks significant progress in the functional grasping capabilities of dexterous hands.
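To make the eigengrasp action space described above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes eigengrasps are computed in the classical way, as the leading principal components of recorded hand joint poses; the summary above does not state the exact procedure. The function names `fit_eigengrasps` and `decode_action`, the joint limits, and the randomly generated "demonstration" poses are all hypothetical placeholders. Only the dimensions (16 joints, 9 components) come from the text.

```python
import numpy as np


def fit_eigengrasps(grasp_poses: np.ndarray, n_components: int = 9):
    """Fit a linear eigengrasp basis with PCA.

    grasp_poses: array of shape (num_poses, 16), one row per recorded hand pose.
    Returns the mean pose (16,) and a basis of shape (n_components, 16).
    """
    mean_pose = grasp_poses.mean(axis=0)
    centered = grasp_poses - mean_pose
    # Rows of vt are orthonormal principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_pose, vt[:n_components]


def decode_action(action, mean_pose, basis, joint_low, joint_high):
    """Map a low-dimensional policy action to 16 joint targets."""
    joints = mean_pose + action @ basis
    # Clip to (assumed) joint limits so the command stays physically valid.
    return np.clip(joints, joint_low, joint_high)


# Placeholder data standing in for a small set of human grasp poses (assumption).
rng = np.random.default_rng(0)
demo_poses = rng.uniform(-0.5, 1.5, size=(200, 16))

mean_pose, basis = fit_eigengrasps(demo_poses, n_components=9)
action = rng.normal(scale=0.1, size=9)  # what an RL policy would output
joint_targets = decode_action(action, mean_pose, basis, -1.0, 2.0)
print(joint_targets.shape)
```

Under these assumptions, the policy only has to explore 9 coordinates, and every action decodes to a combination of hand shapes observed in the data, which is one plausible reason such an action space would yield more stable and realistic motion.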