RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images

Zong-Wei Hong, Yen-Yang Hung, Chu-Song Chen
2024-05-14
Abstract: In this work, we introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image. Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence, i.e., we regress the object coordinates for each visible pixel. Our method leverages existing object detection methods. We incorporate a re-projection mechanism to adjust the camera's intrinsic matrix to accommodate cropping in RGB-D images. Moreover, we transform the 3D object coordinates into a residual representation, which can effectively reduce the output space and yield superior performance. We conducted extensive experiments to validate the efficacy of our approach for 6D pose estimation. Our approach outperforms most previous methods, especially in occlusion scenarios, and demonstrates notable improvements over the state-of-the-art methods. Our code is available on
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses 6-degree-of-freedom (6DoF) object pose estimation from RGB-D images. It proposes RDPN6D, a Residual-based Dense Point-wise Network that computes an object's pose from a single RGB-D image. Unlike existing methods that directly predict the object pose or rely on sparse keypoints for pose recovery, RDPN6D adopts a dense correspondence strategy: it regresses the 3D object coordinates for each visible pixel. The method builds on existing object detectors and incorporates a re-projection mechanism that adjusts the camera intrinsic matrix to account for cropping of the RGB-D image. It also converts the 3D object coordinates into a residual representation, which reduces the output space and improves performance. The reported experiments show that the method outperforms most existing methods on multiple benchmark datasets, with the most notable improvements in occluded scenes. Overall, RDPN6D uses dense correspondences to overcome the weaknesses of existing methods on texture-less objects and under severe occlusion, aiming for more accurate and robust 6DoF object pose estimation.
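The intrinsic-matrix adjustment mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it shows one common way to update a pinhole intrinsic matrix when a detected box is cropped from the image and resized to a fixed network input. The function name `crop_resize_intrinsics`, the box layout `(x0, y0, w, h)`, and all numeric values are assumptions made for illustration, and half-pixel offset conventions are ignored.

```python
import numpy as np


def crop_resize_intrinsics(K, box, out_size):
    """Intrinsics of a crop `box` = (x0, y0, w, h) resized to `out_size` = (out_w, out_h).

    Hypothetical helper: cropping shifts the principal point by (-x0, -y0),
    and resizing scales focal lengths and principal point by (sx, sy).
    """
    x0, y0, w, h = box
    out_w, out_h = out_size
    sx, sy = out_w / w, out_h / h
    # 2D affine map from full-image pixels to crop pixels, in homogeneous form.
    T = np.array([
        [sx, 0.0, -sx * x0],
        [0.0, sy, -sy * y0],
        [0.0, 0.0, 1.0],
    ])
    return T @ K


def project(K, point_cam):
    """Project a 3D point given in the camera frame to pixel coordinates."""
    uvw = K @ np.asarray(point_cam, dtype=float)
    return uvw[:2] / uvw[2]


# Illustrative values only (assumed, not taken from the paper).
K = np.array([
    [600.0, 0.0, 320.0],
    [0.0, 600.0, 240.0],
    [0.0, 0.0, 1.0],
])
box = (200, 120, 128, 128)   # detector box: top-left corner, width, height
out_size = (256, 256)        # network input resolution

K_crop = crop_resize_intrinsics(K, box, out_size)

# Consistency check: projecting with K_crop must match projecting with K
# and then applying the same crop-and-resize to the pixel coordinates.
p = np.array([0.1, -0.05, 1.0])
uv_full = project(K, p)
uv_expected = (uv_full - np.array(box[:2])) * np.array([256 / 128, 256 / 128])
uv_crop = project(K_crop, p)
print(np.allclose(uv_crop, uv_expected))
```

With adjusted intrinsics of this kind, depth values inside the crop can still be back-projected to metric 3D points in the camera frame, which is what allows per-pixel object coordinates to be matched against observed points when recovering the pose.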