6D Object Pose Estimation with Location-and-Channel Attention

Guoyu Zuo, Kexin Chen, Gao Huang
DOI: https://doi.org/10.2139/ssrn.3979336
2021-01-01
SSRN Electronic Journal
Abstract: RGB-D images are widely used in 6D object pose estimation, but fully exploiting the depth data remains challenging. Most existing CNN-based 6D pose estimation methods fuse RGB and depth information through input fusion or result fusion. Other methods use attention models to combine RGB-D data optimally, focusing on visual tasks such as saliency detection and fine-grained categorization. In this paper, we introduce an end-to-end method that estimates the 6D pose of known objects from RGB-D images using a location-and-channel attention (LCA) mechanism, which models the feature correlations within and between the RGB and depth information without a large computational cost. Experiments were performed on the common LINEMOD dataset and on a self-built domain randomization dataset containing a large amount of synthetic data. Results show that our method achieves performance similar to that of state-of-the-art approaches on the LINEMOD dataset. Furthermore, the model trained on the domain randomization dataset can be deployed for 6D object pose estimation in real physical scenes.
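The abstract does not specify how LCA is computed, so the following is only a generic illustration of what "channel" and "location" attention can mean for fused RGB-D features, not the authors' mechanism. It is a minimal pure-Python sketch under stated assumptions: features are stored as a list of channels, each a flat list of per-location values; channel weights come from a sigmoid of each channel's mean; location weights come from a softmax over the channel-averaged response. The function names (`channel_attention`, `location_attention`, `fuse_rgbd`) and these weighting rules are hypothetical choices made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Scale each channel by a weight derived from its mean response.

    Assumption: weight = sigmoid(mean of the channel). Real attention
    modules usually learn this mapping; here it is fixed for clarity.
    """
    weights = [sigmoid(sum(ch) / len(ch)) for ch in features]
    return [[w * v for v in ch] for w, ch in zip(weights, features)]


def location_attention(features):
    """Scale each location by a softmax weight over all locations.

    Assumption: the score of a location is its response averaged
    across channels.
    """
    num_channels = len(features)
    num_locations = len(features[0])
    scores = [
        sum(ch[i] for ch in features) / num_channels
        for i in range(num_locations)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        [weights[i] * ch[i] for i in range(num_locations)]
        for ch in features
    ]


def fuse_rgbd(rgb_feat, depth_feat):
    """Concatenate RGB and depth channels, then apply both attentions.

    Concatenating first lets the weights depend on both modalities,
    which is one simple way to relate RGB and depth features.
    """
    fused = rgb_feat + depth_feat  # list concatenation along the channel axis
    return location_attention(channel_attention(fused))


# Hypothetical toy input: 2 RGB channels and 1 depth channel, 2 locations each.
rgb = [[1.0, 2.0], [0.0, 0.0]]
depth = [[0.5, 0.5]]
attended = fuse_rgbd(rgb, depth)
```

In this toy input the second location has the stronger combined response, so it receives the larger location weight in every channel, including the depth channel whose raw values are equal at both locations.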