Shoujie Li, Haixin Yu, Wenbo Ding, Houde Liu, Linqi Ye, Chongkun Xia, Xueqian Wang, Xiao-Ping Zhang
Abstract: The accurate detection and grasping of transparent objects are challenging but of significance to robots. Here, a visual-tactile fusion framework for transparent object grasping under complex backgrounds and variant light conditions is proposed, including grasping position detection, tactile calibration, and visual-tactile fusion based classification. First, a multi-scene synthetic grasping dataset generation method with a Gaussian distribution based data annotation is proposed. In addition, a novel grasping network named TGCNN is proposed for grasping position detection, showing good results in both synthetic and real scenes. In tactile calibration, inspired by human grasping, a fully convolutional network based tactile feature extraction method and a central location based adaptive grasping strategy are designed, improving the success rate by 36.7% compared to direct grasping. Furthermore, a visual-tactile fusion method is proposed for transparent object classification, which improves the classification accuracy by 34%. The proposed framework synergizes the advantages of vision and touch, and greatly improves the grasping efficiency of transparent objects.
What problem does this paper attempt to address?
This paper addresses the grasping of transparent objects in complex backgrounds. Because the appearance of a transparent object changes significantly with its background, traditional visual detection methods are prone to failure, so accurate and robust detection of transparent objects for efficient grasping has attracted great interest in robotics. Existing methods, however, usually focus on detection alone and assume that the objects rest on static backgrounds with simple patterns, which does not always hold in practice. It is therefore important to develop a grasping method for transparent objects that adapts to varied backgrounds, such as soft or fluid surfaces, scenes with complex patterns, and unpredictable conditions (for example, undulating or underwater scenes).
To meet this challenge, the authors propose a framework that fuses vision and touch, combining the strengths of both senses to greatly improve the grasping efficiency of transparent objects in complex backgrounds. The main contributions of this framework include:
1. **Dataset Generation and Annotation**: A multi-scene synthetic grasping dataset named SimTrans12K is proposed. It covers different styles of backgrounds, illuminations and camera positions, and has richer and more complex background information than previous transparent object datasets (such as ClearGrasp and Dex-NeRF). In addition, a Gaussian distribution based annotation method for transparent object grasping positions (Gaussian-Mask) is proposed, which better represents the position information of transparent objects.
2. **Grasping Network**: For TaTa grippers, a generative grasping network named Transparent Object Grasping Convolutional Neural Network (TGCNN) is designed. Trained only on the synthetic dataset, it detects the grasping positions of transparent objects under complex background and illumination conditions. In addition, a tactile information extraction algorithm and a vision-touch fusion based transparent object classification algorithm are developed to compensate for visual deviations.
3. **Vision-Touch Fusion Grasping Framework**: A transparent object grasping framework based on vision-touch fusion is proposed, comprising tactile calibration, a Tactile Height Sensing (THS) module and a Tactile Position Exploration (TPE) module. It enables grasping of transparent objects that are stacked, overlapping, or even undetectable by vision.
4. **Experimental Verification**: Multiple experiments compare the proposed method extensively against several existing baseline methods. The results show a significant performance improvement in transparent object grasping and classification. The method is also tested in extremely challenging scenes (such as stacking, overlapping, undulating and dynamic underwater environments), which greatly expands the application range of transparent object grasping.
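The Gaussian distribution based annotation in item 1 can be illustrated with a small sketch. This is not the authors' Gaussian-Mask implementation. It only shows the general idea of labeling a grasp position with a value that peaks at the annotated center and decays smoothly around it, instead of a hard binary region. The function names, the isotropic Gaussian form, and the `sigma` parameter are assumptions made for illustration.

```python
import math


def gaussian_mask(height, width, center_row, center_col, sigma):
    """Build a height x width grid of grasp-quality labels.

    Assumption: an isotropic 2D Gaussian centered on the annotated
    grasp point, equal to 1.0 at the center and decaying with distance.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            squared_distance = (r - center_row) ** 2 + (c - center_col) ** 2
            row.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
        mask.append(row)
    return mask


def best_grasp_pixel(mask):
    """Return the (row, col) of the highest-valued pixel in the mask."""
    _, row, col = max(
        (value, r, c)
        for r, values in enumerate(mask)
        for c, value in enumerate(values)
    )
    return row, col


if __name__ == "__main__":
    labels = gaussian_mask(height=5, width=7, center_row=2, center_col=3, sigma=1.0)
    print(best_grasp_pixel(labels))
```

A soft label of this kind gives a network a graded target: pixels near the annotated center are still rewarded, just less than the center itself, which is one plausible reason such an annotation can represent position information better than a binary mask.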
With these methods, the paper addresses the problem of grasping transparent objects in complex backgrounds and offers new ideas and solutions for the development of robotics.
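The abstract mentions a central location based adaptive grasping strategy for tactile calibration. The summary does not describe how it works, so the following is only a hypothetical sketch of one way such a strategy could be framed: given a 2D tactile contact map (for instance, the per-pixel output of a tactile feature extractor), compute the contact centroid and report how far it lies from the sensor center. The function names, the contact-map representation, and the `tolerance` threshold are all assumptions, not details from the paper.

```python
def contact_centroid(contact_map):
    """Return the intensity-weighted (row, col) centroid, or None if no contact."""
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(contact_map):
        for c, value in enumerate(row):
            total += value
            row_sum += r * value
            col_sum += c * value
    if total == 0:
        return None
    return row_sum / total, col_sum / total


def grasp_correction(contact_map, tolerance=0.5):
    """Return the (row, col) offset of the contact centroid from the sensor center.

    Returns None when nothing is touched, and (0.0, 0.0) when the contact
    is already centered within `tolerance` pixels (hypothetical threshold).
    """
    centroid = contact_centroid(contact_map)
    if centroid is None:
        return None
    center_row = (len(contact_map) - 1) / 2
    center_col = (len(contact_map[0]) - 1) / 2
    d_row = centroid[0] - center_row
    d_col = centroid[1] - center_col
    if abs(d_row) <= tolerance and abs(d_col) <= tolerance:
        return (0.0, 0.0)
    return (d_row, d_col)


if __name__ == "__main__":
    # Contact detected only in the lower-left corner of a 5x5 sensor.
    off_center = [
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
    ]
    print(grasp_correction(off_center))
```

In a real system the pixel offset would still need to be converted into a gripper motion through a sensor-to-robot calibration, which this sketch leaves out.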