Real-time 6D pose estimation from a single RGB image.

Xin Zhang, Zhiguo Jiang, Haopeng Zhang
DOI: https://doi.org/10.1016/j.imavis.2019.06.013
IF: 3.86
2019-01-01
Image and Vision Computing
Abstract: We propose an end-to-end deep learning architecture that simultaneously detects objects and recovers their 6D poses from an RGB image. Concretely, we extend the 2D detection pipeline with a pose estimation module that indirectly regresses the image coordinates of the object's 3D vertices from the 2D detection results. The object's 6D pose can then be estimated with a Perspective-n-Point algorithm, without any post-refinement. Moreover, we design the backbone to maintain the spatial resolution of low-level features for the pose estimation task. Compared with state-of-the-art RGB-based pose estimation methods, our approach achieves competitive or superior performance on two benchmark datasets at an inference speed of 25 fps on a GTX 1080Ti GPU, which is sufficient for real-time processing.
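To make the phrase "image coordinates of the object's 3D vertices" concrete, the following is a minimal sketch of the pinhole projection that maps the eight corners of an object's 3D bounding box to pixel coordinates for a given pose. It is not the paper's code: the helper names (`rotation_y`, `box_corners`, `project`), the box size, the pose, and the camera intrinsics are all hypothetical values chosen for illustration. A Perspective-n-Point solver works in the opposite direction, recovering the rotation and translation from such 2D-3D correspondences.

```python
import math

def rotation_y(angle_rad):
    """Rotation matrix for a rotation about the camera's y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def box_corners(half_extents):
    """Eight corners of an axis-aligned box centred at the object origin."""
    hx, hy, hz = half_extents
    return [(sx * hx, sy * hy, sz * hz)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]

def project(points, R, t, fx, fy, cx, cy):
    """Project object-frame 3D points to pixels with a pinhole camera.

    R, t: object-to-camera rotation (3x3) and translation (3,), i.e. the 6D pose.
    fx, fy, cx, cy: focal lengths and principal point in pixels.
    """
    pixels = []
    for X in points:
        # Transform the point into the camera frame: Xc = R @ X + t.
        xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        if xc[2] <= 0:
            raise ValueError("point is behind the camera")
        # Perspective division followed by the intrinsic mapping.
        pixels.append((fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy))
    return pixels

# Assumed example: a 10 cm cube, rotated 30 degrees, 1 m in front of the camera.
corners = box_corners((0.05, 0.05, 0.05))
R = rotation_y(math.radians(30))
t = (0.0, 0.0, 1.0)
pixels = project(corners, R, t, fx=600.0, fy=600.0, cx=320.0, cy=240.0)

for corner, (u, v) in zip(corners, pixels):
    print(corner, "->", (round(u, 1), round(v, 1)))
```

In a pipeline of the kind the abstract describes, the network would predict the eight pixel locations, and a PnP solver (for example OpenCV's `cv2.solvePnP`, given the known 3D corners and camera intrinsics) would return the pose.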