Abstract: Six-degree-of-freedom (6DoF) object pose estimation is a crucial task for virtual reality and accurate robotic manipulation. Category-level 6DoF pose estimation has recently become popular because it improves generalization across an entire category of objects. However, current methods focus on data-driven differential learning, which makes them highly dependent on the quality of real-world labeled data and limits their ability to generalize to unseen objects. To address this problem, we propose multi-hypothesis (MH) consistency learning (MH6D) for category-level 6-D object pose estimation without using real-world training data. MH6D uses a parallel consistency learning structure, which alleviates the uncertainty of single-shot feature extraction and promotes domain self-adaptation to reduce the synthetic-to-real domain gap. Specifically, three randomly sampled pose transformations are first applied in parallel to the input point cloud. An attention-guided category-level 6-D pose estimation network with channel attention (CA) and global feature cross-attention (GFCA) modules is then proposed to estimate the three hypothesized 6-D object poses by effectively extracting and fusing global and local features. Finally, we propose a novel loss function that considers both process and final-result information, allowing MH6D to perform robust consistency learning. We conduct experiments under two training data settings (synthetic data only, and synthetic plus real-world data) to verify the generalization ability of MH6D. Extensive experiments on benchmark datasets demonstrate that MH6D achieves state-of-the-art (SOTA) performance, outperforming most data-driven methods even without using any real-world data. The code is available at https://github.com/CNJianLiu/MH6D.
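
The multi-hypothesis idea can be illustrated with a minimal NumPy sketch. It is not the MH6D implementation: the function names, the translation range `max_shift`, and the simple pairwise squared-difference loss are all assumptions made for illustration, and the loss described in the abstract (which also uses process information) is not reproduced here. The sketch only shows how several randomly transformed copies of a point cloud yield pose hypotheses that can be mapped back to a common frame and compared.

```python
import numpy as np


def random_rotation(rng):
    """Sample a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))      # remove the column sign ambiguity of QR
    if np.linalg.det(q) < 0:         # enforce a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q


def make_hypotheses(points, rng, n_hyp=3, max_shift=0.05):
    """Apply n_hyp random rigid transforms to an (N, 3) point cloud.

    max_shift is a hypothetical translation range, not a value from the paper.
    Returns a list of (R_aug, t_aug, transformed_points).
    """
    hyps = []
    for _ in range(n_hyp):
        R = random_rotation(rng)
        t = rng.uniform(-max_shift, max_shift, size=3)
        hyps.append((R, t, points @ R.T + t))
    return hyps


def to_input_frame(R_aug, t_aug, R_pred, t_pred):
    """Map a pose predicted on a transformed cloud back to the input frame.

    If x' = R_aug x + t_aug and x' = R_pred c + t_pred for canonical points c,
    then x = (R_aug^T R_pred) c + R_aug^T (t_pred - t_aug).
    """
    return R_aug.T @ R_pred, R_aug.T @ (t_pred - t_aug)


def consistency_loss(poses):
    """Mean pairwise squared discrepancy between pose hypotheses.

    An illustrative stand-in, not the loss proposed in the paper.
    """
    total, count = 0.0, 0
    for i in range(len(poses)):
        for j in range(i + 1, len(poses)):
            (Ri, ti), (Rj, tj) = poses[i], poses[j]
            total += np.linalg.norm(Ri - Rj) ** 2 + np.linalg.norm(ti - tj) ** 2
            count += 1
    return total / count
```

In a training loop, a pose network would be run on each transformed cloud and the predictions passed through `to_input_frame` before computing the loss; a perfectly consistent estimator would drive the loss to zero without any pose labels.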
