Abstract: Category-level pose estimation is a challenging problem due to intra-class shape variations. Recent methods deform pre-computed shape priors to map the observed point cloud into the normalized object coordinate space and then retrieve the pose via post-processing, i.e., Umeyama's algorithm. This two-stage strategy has two shortcomings: 1) the surrogate supervision on the intermediate results cannot directly guide the learning of pose, resulting in large pose errors after post-processing; 2) the inference speed is limited by the post-processing step. To address these shortcomings, we propose SSP-Pose, an end-to-end trainable network for category-level pose estimation that integrates shape priors into a direct pose regression network. SSP-Pose stacks four individual branches on a shared feature extractor: two branches deform the prior model and match it with the observed instance, a third directly regresses the full 9 degrees-of-freedom pose, and a fourth performs symmetry reconstruction and point-wise inlier mask prediction. Consistency loss terms are then naturally exploited to align the outputs of the different branches and improve performance. During inference, only the direct pose regression branch is needed. In this manner, SSP-Pose not only learns category-level pose-sensitive characteristics to boost performance but also maintains real-time inference speed. Moreover, we utilize the symmetry information of each category to guide the shape prior deformation, and propose a novel symmetry-aware loss to mitigate the matching ambiguity. Extensive experiments on public datasets demonstrate that SSP-Pose outperforms competing methods while running in real time at about 25 Hz. The code will be released soon.
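
For readers unfamiliar with the post-processing step that the two-stage methods rely on, the following is a minimal NumPy sketch of Umeyama-style similarity alignment. It recovers a scale, rotation, and translation that map predicted normalized coordinates onto the observed points in the least-squares sense. The function name `umeyama_alignment` and its argument names are illustrative assumptions, not taken from the SSP-Pose code, and the sketch fits a single uniform scale rather than the per-axis size used in a 9 degrees-of-freedom pose. Practical pipelines usually wrap this step in an outlier-rejection loop such as RANSAC, which is omitted here.

```python
import numpy as np


def umeyama_alignment(src, dst):
    """Least-squares similarity transform with dst ~= scale * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points (hypothetical inputs,
    e.g. predicted normalized coordinates and observed camera-frame points).
    Returns (scale, R, t) with R a proper rotation matrix.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]

    # Center both point sets.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between target and source.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a rotation, not a reflection.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    scale = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - scale * R @ mu_src
    return scale, R, t
```

Because this closed-form solve runs once per object after the network forward pass, it gives a rough idea of the extra step that a direct regression branch avoids at inference time.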

DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

GenPose: Generative Category-level Object Pose Estimation via Diffusion Models

DiffPose: Toward More Reliable 3D Pose Estimation

Learning a Category-level Object Pose Estimator without Pose Annotations

Di^2Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation

Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation

ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models

Confronting Ambiguity in 6D Object Pose Estimation via Score-Based Diffusion on SE(3)

MH6D: Multi-Hypothesis Consistency Learning for Category-Level 6-D Object Pose Estimation

Sim2real transfer learning for 3D human pose estimation: motion to the rescue

GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

DiffPose: Reliable 2D Pose Estimation Through Denoising Diffusion

SSP-Pose: Symmetry-Aware Shape Prior Deformation for Direct Category-Level Object Pose Estimation

Object Pose Estimation via the Aggregation of Diffusion Features

Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation

SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation

CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild

3D Human Pose Analysis via Diffusion Synthesis

Learning Geometric Consistency and Discrepancy for Category-Level 6D Object Pose Estimation from Point Clouds

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation