Abstract: 6-D pose estimation is an important branch of vision measurement and is widely used in robotics, autonomous driving, and augmented reality. The latest research trend in 6-D pose estimation is to train a deep neural network to predict the 2-D projections of 3-D keypoints directly from the image, establish the 2-D to 3-D correspondences, and then recover the pose with the perspective-n-point (PnP) algorithm. The current challenge is that detection accuracy drops when objects are textureless or occluded or when the scene is cluttered, and most existing models are large and cannot meet real-time requirements. In this article, we introduce a densely connected feature pyramid network (DFPN) that can efficiently integrate and utilize features. We combine the cross-stage partial network (CSPNet) with DFPN to design DFPN-6-D, a new network for 6-D object pose estimation. DFPN-6-D can efficiently and accurately handle textureless and occluded objects in cluttered scenes and estimate their full 6-D poses in a single shot. Furthermore, we propose a new confidence calculation method and loss function for object pose estimation that fully consider spatial information. Finally, we propose a novel augmentation method for direct 6-D pose estimation approaches, called 6-D augmentation, which improves performance and generalization under occlusion. Our approach achieves new state-of-the-art accuracies of 98.06 and 87.09 in terms of the ADD(-S) metric on the Linemod and Occluded-Linemod datasets, respectively, and also achieves the best results under the respective metrics of the MULT-I, BIN-P, and T-LESS datasets, while still running end-to-end at over 65 frames/s. The experimental results demonstrate that our algorithm is robust to textureless materials and occlusion while running more efficiently than other methods. We also deploy the proposed method on a real robot to grasp and manipulate objects based on the estimated pose.
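
For readers unfamiliar with the ADD(-S) metric cited above, the following is a minimal sketch of how it is commonly computed. It is not the authors' evaluation code. The function names, the nested-list pose representation, and the 10%-of-diameter acceptance threshold are assumptions made for illustration; the abstract does not specify them.

```python
import math

def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t (length 3) to 3-D points."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)

def add_s_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its closest predicted point.

    Typically used for symmetric objects, where point correspondences are ambiguous.
    """
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)

def pose_is_correct(error, diameter, threshold=0.1):
    """Assumed convention: accept the pose if the error is below 10% of the object diameter."""
    return error < threshold * diameter

# Hypothetical toy model: two points that are symmetric about the z-axis.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ROT_Z_180 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
MODEL = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
ZERO = (0.0, 0.0, 0.0)

add_err = add_metric(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO)
add_s_err = add_s_metric(MODEL, IDENTITY, ZERO, ROT_Z_180, ZERO)
```

In this toy case the predicted pose is a 180-degree rotation about the symmetry axis, so ADD penalizes it heavily while ADD-S treats it as a perfect match. This is why the symmetric variant is typically reported for symmetric objects. The reported accuracy is then the fraction of test images whose error passes the threshold.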

An efficient lightweight deep neural network for real-time object 6D pose estimation with RGB-D inputs

Recurrent Volume-based 3D Feature Fusion for Real-time Multi-view Object Pose Estimation

Real-Time and Efficient 6-D Pose Estimation from a Single RGB Image

3D Point-to-Keypoint Voting Network for 6D Pose Estimation

An Iterative Attention Fusion Network for 6D Object Pose Estimation

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion

Efficient 6D Object Pose Estimation Based on Attentive Multi‐scale Contextual Information

ACF-Net: Attention Context Fusion Network for 6D Pose Estimation

RFF-PoseNet: A 6D Object Pose Estimation Network Based on Robust Feature Fusion in Complex Scenes

Depth-Based Lightweight Feature Fusion Network for Category-Level 6D Pose Estimation

RFFCE: Residual Feature Fusion and Confidence Evaluation Network for 6dof Pose Estimation

A Lightweight Two-End Feature Fusion Network for Object 6D Pose Estimation

A Novel 6D Pose Estimation Method for Indoor Objects Based on Monocular Regression Depth

MFPN-6D: Real-time One-stage Pose Estimation of Objects on RGB Images

6D Object Pose Estimation Based on Cross-Modality Feature Fusion

SaMfENet: Self-Attention Based Multi-Scale Feature Fusion Coding and Edge Information Constraint Network for 6D Pose Estimation

RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images

A Lightweight Color and Geometry Feature Extraction and Fusion Module for End-to-end 6D Pose Estimation

An Efficient Color and Geometric Feature Fusion Module for 6D Object Pose Estimation

Graph Neural Network for 6D Object Pose Estimation

6D Object Pose Estimation in Cluttered Scenes from RGB Images