Abstract: 3D object pose estimation is a critical task in many real-world applications, e.g., robotic manipulation and augmented reality. Most existing methods focus on estimating the pose of object instances or categories that were seen during training. In the real world, however, it is often necessary to estimate the pose of unseen objects without re-training the network. We therefore propose a 3D pose estimation method for unseen objects that requires no re-training. Specifically, given the CAD model of the unseen object, a set of template RGB-D images (RGB images and depth images) is rendered from different viewpoints. A feature embedding network, named PoseFusion, is then designed to extract scene features. In this network, the RGB and depth images are used to extract texture features and geometric features, respectively. A cross-modality alignment module is proposed to eliminate the noise in each single modality, and the aligned texture and geometric features are fused by a geometry-guided fusion module. Through PoseFusion, the template RGB-D images rendered from the CAD model are abstracted into a set of template scene features, and query scene features are embedded in the same way from the captured RGB-D images of the unseen object. Finally, the query scene features are matched against the template scene features by computing a masked local similarity, and the identity and pose of the unseen object are determined by the most similar template. Experiments on the LINEMOD and T-LESS datasets demonstrate that our method outperforms other methods and generalizes better to unseen objects. Extensive ablation studies verify the effectiveness of PoseFusion.
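
The final matching step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the masked local similarity is computed, so the sketch assumes per-location cosine similarity averaged over the template's object mask. The function names (`cosine`, `masked_local_similarity`, `match_template`) and the toy feature values are hypothetical, and the features are hand-written vectors standing in for PoseFusion embeddings.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors (assumed local measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + eps)


def masked_local_similarity(query, template, mask):
    """Average the per-location similarity over masked (object) locations only.

    query, template: lists of per-location feature vectors of equal length.
    mask: list of booleans, True where the template shows the object.
    """
    sims = [cosine(q, t) for q, t, m in zip(query, template, mask) if m]
    # An empty mask carries no evidence, so return the lowest possible score.
    return sum(sims) / len(sims) if sims else -1.0


def match_template(query, templates, masks):
    """Return (index, score) of the template most similar to the query."""
    scores = [masked_local_similarity(query, t, m)
              for t, m in zip(templates, masks)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy data: two templates with three spatial locations and 2-D features each.
templates = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0], [1.0, -1.0]],
]
# The third location is background in both templates and is ignored.
masks = [[True, True, False], [True, True, False]]
# The query agrees with template 0 inside the mask and differs outside it.
query = [[2.0, 0.0], [0.0, 3.0], [-1.0, 5.0]]

best, score = match_template(query, templates, masks)
```

In this sketch the index of the winning template stands in for both outputs of the method: each template is rendered from a known object at a known viewpoint, so selecting a template selects an identity and a pose. Restricting the average to the mask keeps background clutter in the query from affecting the score.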

Category-level Pose Estimation and Iterative Refinement for Monocular RGB-D Image

Recurrent Volume-based 3D Feature Fusion for Real-time Multi-view Object Pose Estimation

3D Point-to-Keypoint Voting Network for 6D Pose Estimation

FDN: Feature Decoupling Network for Head Pose Estimation.

RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery

Zero-Shot 3D Pose Estimation of Unseen Object by Two-Step RGB-D Fusion

CATRE: Iterative Point Clouds Alignment for Category-Level Object Pose Refinement

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image

RFFCE: Residual Feature Fusion and Confidence Evaluation Network for 6DoF Pose Estimation.

KGNet: Knowledge-Guided Networks for Category-Level 6D Object Pose and Size Estimation.

Category-Level Object Pose Estimation with Statistic Attention

Attention Guided 6D Object Pose Estimation with Multi-constraints Voting Network

GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

DualPoseNet: Category-level 6D Object Pose and Size Estimation Using Dual Pose Network with Refined Learning of Pose Consistency

A Transformer-based multi-modal fusion network for 6D pose estimation

ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation

Robust Classification and 6D Pose Estimation by Sensor Dual Fusion of Image and Point Cloud Data