Pose-guided Auto-Encoder and Feature-Based Refinement for 6-Dof Object Pose Regression

Zhigang Li,Xiangyang Ji
DOI: https://doi.org/10.1109/icra40945.2020.9196953
2020-01-01
Abstract:Accurately estimating the 6-DoF object pose from a single RGB image is a challenging task in computer vision. Though pose regression approaches have achieved great progress, the performance is still limited. In this work, we propose Pose-guided Auto-Encoder (PAE), which can distill better pose-related features from the image by utilizing a suitable pose representation, 3D Location Field (3DLF), to guide the encoding process. The features from PAE show strong robustness to pose-irrelevant factors. Compared with traditional auto-encoder, PAE can not only improve the pose estimation performance but also handle the ambiguity viewpoints problem. Further, we propose Feature-based Pose Refiner (FPR), which refines the pose from the extracted features without rendering. Combining PAE with FPR, our approach achieved state-of-the-art performance on the widely used LINEMOD dataset. Our approach not only outperforms the direct regression-based approaches with a large margin but also thrillingly surpasses current state-of-the-art indirect PnP-based approach.
What problem does this paper attempt to address?