Semi-Supervised 6D Object Pose Estimation Without Using Real Annotations

Guangliang Zhou, Deming Wang, Yi Yan, Huiyi Chen, Qijun Chen
DOI: https://doi.org/10.1109/tcsvt.2021.3138129
IF: 5.859
2022-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: 6D object pose estimation is a longstanding computer vision problem. Existing deep learning-based methods have achieved promising results on this task. However, these methods need large-scale annotated training data to perform well, and acquiring real 6D object pose annotations is labor-intensive and time-consuming. To overcome this drawback, we propose a semi-supervised pose estimation method that uses labeled synthetic data and unlabeled real data. For unlabeled real data, we form a self-supervised pipeline by minimizing the distance between the input point cloud, which is observed under the ground-truth pose, and the model points transformed by the predicted pose. The labeled synthetic data supervises the network so that it converges correctly. We also use a feature mapping to eliminate the domain gap between real and synthetic features, which further enhances the network's performance. Moreover, we propose an attention-based pose estimation network that concentrates on the distinguishing features, thus improving the accuracy of pose estimation. Experiments show that our proposed semi-supervised method achieves good performance without real annotations and outperforms all other methods that rely on synthetic data or self-supervision strategies, indicating that the proposed method is effective.
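The self-supervised term described in the abstract can be illustrated with a small sketch. The abstract does not specify the exact distance, so the symmetric nearest-neighbour (Chamfer-style) distance below is an assumption, and all function and variable names (`transform`, `one_sided_distance`, `self_supervised_loss`) are hypothetical. The idea shown is only that the loss needs no pose label: it compares the observed points with the object model moved by the predicted pose.

```python
import math

def transform(points, R, t):
    """Apply rotation R (3x3 nested list) and translation t (length 3) to each 3D point."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def one_sided_distance(src, dst):
    """Mean distance from each point in src to its nearest neighbour in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

def self_supervised_loss(observed, model_points, R_pred, t_pred):
    """Chamfer-style distance between the observed cloud and the posed model.

    Assumption: a symmetric nearest-neighbour distance stands in for the
    unspecified distance in the abstract.
    """
    posed = transform(model_points, R_pred, t_pred)
    return 0.5 * (one_sided_distance(observed, posed)
                  + one_sided_distance(posed, observed))

# Toy object model (hypothetical) and a "true" pose: 90 degrees about z plus a shift.
model_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
R_true = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t_true = (0.1, 0.0, 0.5)

# The observed cloud stands in for unlabeled real data; its pose is never given to the loss.
observed = transform(model_points, R_true, t_true)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss_good = self_supervised_loss(observed, model_points, R_true, t_true)
loss_bad = self_supervised_loss(observed, model_points, identity, (0.0, 0.0, 0.0))
```

In this toy setup the loss vanishes when the predicted pose matches the pose that generated the observation and is positive for the identity pose. Real depth data covers only the visible side of an object, so a practical variant might weight or drop the model-to-observation term; that choice is also an assumption here, not something the abstract states.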