WS-OPE: Weakly Supervised 6D Object Pose Regression using Relative Multi-Camera Pose Constraints

Shaowu Yang, Slobodan Ilic, Fu Li, Benjamin Busam, I. Shugurov
DOI: https://doi.org/10.1109/LRA.2022.3146924
IF: 5.2
IEEE Robotics and Automation Letters
Abstract: Precise annotation of 6D poses in real data is intricate and time-consuming, yet it is an essential requirement for training pose estimation pipelines. We propose a scalable, end-to-end approach to 6D pose regression with weak supervision that avoids this problem. Our method requires neither 3D models nor 6D object poses as ground truth. Instead, we use 2D bounding boxes and object sizes as the only labels and constrain the problem with multiple images of known relative poses during training. A novel Rotated-IoU loss brings together a pose prediction from an image with labeled 2D bounding boxes of the corresponding object in other views. Our rotation estimation combines an initial coarse pose classification with an offset regression, using a continuous rotation parametrization that allows for direct pose estimation. At test time, the model still uses only a single image to predict a 6D pose. We observe that the multi-view constraints and our rotation representation used during training lead to better learning of 6D pose embeddings in comparison to fully supervised methods. Experiments on several datasets show that the proposed method is capable of predicting poses of good quality despite being trained with only weak labels. Direct pose regression without the need for a subsequent refinement stage thereby ensures real-time performance.
Computer Science
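The multi-view constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pinhole intrinsics `K`, and the pose conventions (object-to-camera rotation `R` and translation `t`, relative pose mapping camera A coordinates to camera B coordinates) are all assumptions made for illustration. It also substitutes a plain axis-aligned, non-differentiable IoU for the paper's Rotated-IoU loss, whose exact definition is not given in the abstract.

```python
import numpy as np

def box_corners(size):
    """Eight corners of a 3D box centred at the object origin (size = w, h, d)."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=float)
    return 0.5 * signs * np.asarray(size, dtype=float)

def transfer_pose(R_a, t_a, R_ab, t_ab):
    """Object pose in camera B from its pose in camera A and the known A->B pose."""
    return R_ab @ R_a, R_ab @ t_a + t_ab

def project_box(R, t, size, K):
    """Project the 3D box with pose (R, t) and return the enclosing 2D box
    as (x_min, y_min, x_max, y_max) in pixels."""
    pts_cam = box_corners(size) @ R.T + t      # corners in camera coordinates
    uv = pts_cam @ K.T                         # homogeneous pixel coordinates
    uv = uv[:, :2] / uv[:, 2:3]                # perspective division
    return np.concatenate([uv.min(axis=0), uv.max(axis=0)])

def iou(a, b):
    """Axis-aligned IoU of two boxes (a simplification of the Rotated-IoU idea)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0
```

Under these assumptions, a pose predicted in view A would be moved into view B with `transfer_pose`, projected with `project_box` using the labeled object size, and scored against the labeled 2D bounding box in view B with `iou`. A training loss would need a differentiable formulation in a deep learning framework rather than this NumPy version.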