GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence

Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, Vincent Lepetit

2024-03-15

Abstract: We present GigaPose, a fast, robust, and accurate method for CAD-based novel object pose estimation in RGB images. GigaPose first leverages discriminative "templates", rendered images of the CAD models, to recover the out-of-plane rotation and then uses patch correspondences to estimate the four remaining parameters. Our approach samples templates in only a two-degrees-of-freedom space instead of the usual three and matches the input image to the templates using fast nearest-neighbor search in feature space, resulting in a speedup factor of 35x compared to the state of the art. Moreover, GigaPose is significantly more robust to segmentation errors. Our extensive evaluation on the seven core datasets of the BOP challenge demonstrates that it achieves state-of-the-art accuracy and can be seamlessly integrated with existing refinement methods. Additionally, we show the potential of GigaPose with 3D models predicted by recent work on 3D reconstruction from a single image, relaxing the need for CAD models and making 6D object pose estimation much more convenient. Our source code and trained models are publicly available at https://github.com/nv-nguyen/gigaPose
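The template-matching step mentioned in the abstract, nearest-neighbor search in feature space, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `nearest_template`, the use of one global descriptor per image (the summary of the paper describes local features), the cosine similarity metric, and the random feature vectors standing in for the output of a learned feature extractor. It is not the authors' implementation.

```python
import numpy as np

def nearest_template(query_feat, template_feats):
    """Return (index, cosine similarity) of the template closest to the query.

    query_feat:     (D,) descriptor of the input crop (hypothetical).
    template_feats: (N, D) descriptors of the rendered templates (hypothetical).
    """
    # Normalize so that the dot product equals the cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    t = template_feats / np.linalg.norm(template_feats, axis=1, keepdims=True)
    sims = t @ q                      # (N,) similarity to every template
    best = int(np.argmax(sims))       # brute-force nearest neighbor
    return best, float(sims[best])

# Toy data: 100 templates with 64-D descriptors (both numbers are arbitrary).
rng = np.random.default_rng(0)
template_feats = rng.normal(size=(100, 64))
# A query that is a slightly perturbed copy of template 17.
query_feat = template_feats[17] + 0.01 * rng.normal(size=64)

best, sim = nearest_template(query_feat, template_feats)
print(best, round(sim, 3))
```

Because each template corresponds to one sampled viewpoint, the index of the best match would stand for a candidate out-of-plane rotation in this simplified picture; a brute-force search like this one costs a single matrix-vector product per query.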

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper proposes a method called **GigaPose**, which aims to address two main issues in novel object pose estimation (6D pose estimation):

1. **Low inference speed**: Existing coarse pose estimation methods rely on template matching, which makes them slow. For example, MegaPose requires over 1.6 seconds to process each detected object.
2. **Sensitivity to segmentation errors**: Existing template matching methods perform poorly when dealing with segmentation errors caused by occlusion.

GigaPose addresses these issues in the following ways:

- Matching the input image to templates with local features, which yields a 35-fold speedup in template search.
- In the coarse pose estimation stage, estimating the remaining four degrees of freedom (i.e., in-plane rotation and translation) from a single 2D-2D correspondence, which improves robustness to segmentation errors.

Experimental results show that GigaPose achieves significant performance improvements on the seven core datasets of the BOP challenge and can be seamlessly integrated with existing refinement methods to achieve higher accuracy and faster speeds. Additionally, GigaPose can use 3D models predicted from a single image for pose estimation, reducing the need for precise CAD models and making 6D pose estimation more convenient.
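To make the "one correspondence" idea concrete, here is a hedged sketch of how a single matched point pair can fix four parameters in the image plane. The summary above says only that one 2D-2D correspondence determines the remaining four degrees of freedom; the reading used here is an assumption: the four parameters are taken to be a 2D scale (standing in for depth), an in-plane rotation angle, and a 2D shift, with the scale and angle supplied alongside the matched pair. Under that assumption the template-to-query map is a 2D similarity transform, M = [sR(α) | t] with t = p_query − sR(α) p_template. The function names are hypothetical and this is not the authors' code.

```python
import numpy as np

def similarity_from_correspondence(p_template, p_query, scale, angle_rad):
    """Build the 2x3 matrix M = [s*R | t] mapping template pixels to query pixels.

    p_template, p_query: matched 2D points (x, y) in the template and the query.
    scale, angle_rad:    relative scale and in-plane rotation for this match
                         (assumed to be given, e.g. predicted from the patches).
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    linear = scale * rot
    # Choose the shift so that the matched template point lands on the query point.
    t = np.asarray(p_query, float) - linear @ np.asarray(p_template, float)
    return np.hstack([linear, t[:, None]])

def apply_transform(M, pts):
    """Apply a 2x3 similarity transform to an (N, 2) array of points."""
    pts = np.atleast_2d(np.asarray(pts, float))
    return pts @ M[:, :2].T + M[:, 2]

# One correspondence: template pixel (50, 40) matches query pixel (120, 90),
# with the object appearing twice as large and rotated by 90 degrees.
M = similarity_from_correspondence((50.0, 40.0), (120.0, 90.0),
                                   scale=2.0, angle_rad=np.pi / 2)
mapped = apply_transform(M, [(50.0, 40.0), (51.0, 40.0)])
print(mapped)
```

In this toy case the matched point maps exactly onto its partner, and a one-pixel step to the right in the template becomes a two-pixel step along the y axis in the query. Since only one match is needed, a hypothesis can be formed from any single reliable patch, which is one plausible reason such a scheme tolerates imperfect segmentation masks.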

GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting

Zero-Shot 3D Pose Estimation of Unseen Object by Two-Step RGB-D Fusion

Object Pose Estimation Based on Multi-precision Vectors and Seg-Driven PnP

KGNet: Knowledge-Guided Networks for Category-Level 6D Object Pose and Size Estimation

OnePose: One-Shot Object Pose Estimation Without CAD Models

GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting

GPV-Pose: Category-level Object Pose Estimation Via Geometry-guided Point-wise Voting

GeoPose: Dense Reconstruction Guided 6D Object Pose Estimation with Geometric Consistency

GPT-COPE: A Graph-Guided Point Transformer for Category-Level Object Pose Estimation

OCSKB: An Object Component Sketch Knowledge Base for Fast 6D Pose Estimation

Real-Time and Efficient 6-D Pose Estimation from a Single RGB Image

BOP Challenge 2022 on Detection, Segmentation and Pose Estimation of Specific Rigid Objects

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

GDE-Pose: A Real-Time Adaptive Compression and Multi-Scale Dynamic Feature Fusion Approach for Pose Estimation

FoundPose: Unseen Object Pose Estimation with Foundation Features

CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation

CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation

ZeroPose: CAD-Prompted Zero-shot Object 6D Pose Estimation in Cluttered Scenes