6-DoF grasp estimation method that fuses RGB-D data based on external attention
Haosong Ran, Qinshu Chen, Yifei Li, Yazhe Luo, Xiaoyu Zhang, Jiting Li, DianSheng Chen, XiaoChuan Zhang
DOI: https://doi.org/10.1016/j.jvcir.2024.104173
IF: 2.887
2024-05-16
Journal of Visual Communication and Image Representation
Abstract: 6-DoF grasp estimation from point clouds has long been a challenge in robotics: relying on a single input modality limits the robot's perception of real-world scenes and thus reduces its robustness. In this work, we propose a 6-DoF grasp pose estimation method based on RGB-D data. It uses ResNet to extract color image features, PointNet++ to extract geometric features, and an external attention mechanism to fuse the two. The method is designed end to end, and we validate it through benchmark tests on a large-scale dataset and through evaluations in a simulated robot environment. On public datasets it outperforms previous state-of-the-art methods, achieving 47.75 mAP on seen objects and 40.08 mAP on unseen objects. We also test the method on multiple objects in a simulated robot environment, where it shows higher grasp accuracy and robustness than previous methods.
computer science, information systems, software engineering
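
The fusion step named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the two feature streams are combined, so the per-point concatenation, the residual connection, the feature widths (64 and 128), the memory size (32), and the names `external_attention` and `fuse_rgbd` are all assumptions made for illustration. The attention itself follows the commonly described form of external attention, in which two small learnable memories replace the keys and values of self-attention and the attention map is normalized twice.

```python
import numpy as np

def external_attention(feats, mem_k, mem_v, eps=1e-9):
    """External attention over N per-point features.

    feats: (N, d) input features
    mem_k: (S, d) learnable key memory (random here, for illustration)
    mem_v: (S, d) learnable value memory (random here, for illustration)
    """
    attn = feats @ mem_k.T                                   # (N, S) similarities
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))    # stable exponent
    attn = attn / attn.sum(axis=0, keepdims=True)            # softmax over the N points
    attn = attn / (eps + attn.sum(axis=1, keepdims=True))    # L1 norm over the S slots
    return attn @ mem_v                                      # (N, d)

def fuse_rgbd(color_feats, geom_feats, mem_k, mem_v):
    """Hypothetical fusion: concatenate per-point features, then refine them
    with external attention plus a residual connection (both assumptions)."""
    joint = np.concatenate([color_feats, geom_feats], axis=1)
    return joint + external_attention(joint, mem_k, mem_v)

rng = np.random.default_rng(0)
n_points, d_color, d_geom, n_slots = 1024, 64, 128, 32

# Stand-ins for ResNet and PointNet++ outputs sampled at the same points.
color_feats = rng.normal(size=(n_points, d_color)) * 0.1
geom_feats = rng.normal(size=(n_points, d_geom)) * 0.1
mem_k = rng.normal(size=(n_slots, d_color + d_geom)) * 0.1
mem_v = rng.normal(size=(n_slots, d_color + d_geom)) * 0.1

fused = fuse_rgbd(color_feats, geom_feats, mem_k, mem_v)
print(fused.shape)
```

Because the memories have a fixed size, the cost of this step grows linearly with the number of points, which is one reason external attention is attractive for dense point features. In a trained network the memories would be learned parameters rather than random arrays.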