Abstract: Camera relocalization, the task of estimating camera pose within a known scene, is challenging and has wide applications in Virtual Reality (VR), Augmented Reality (AR), robotics, and other fields. Most existing learning-based methods use all the information within an image for pose estimation. Although these methods achieve leading pose accuracy in some cases, they still struggle to remain robust under challenging viewpoints without degrading localization accuracy on viewpoints that are easier to localize. In this paper, we propose a novel two-branch camera pose estimation framework: one branch uses keypoint-guided partial scene coordinate regression, while the other uses full scene coordinate regression to assess the credibility of image poses, thereby enabling more accurate camera localization. In particular, we devise a keypoint selection method based on matching rates, which measure the matching quality between a 3D keypoint and 2D keypoints across views. With the selected 3D keypoints, we generate a 2D supervision mask using the ground-truth camera pose to supervise the keypoint predictions of the keypoint selection network. We further refine the 2D supervision mask by optimizing reprojection errors on the scene coordinate network, so that the network estimates scene coordinates for the points in the scene that truly warrant attention, which also improves localization performance. We also introduce a gated camera pose estimation strategy on the two-branch framework, employing the updated keypoint selection network for images with higher credibility and a more robust network for difficult viewpoints. By adopting an effective curriculum learning scheme, we achieve higher accuracy within a training span of just 20 minutes. Extensive experiments validate the superior performance of our method.
The code is released at https://github.com/DUT-ICCD/KP-Guided-Reloc.
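
To make the mask-generation step in the abstract concrete, the following is a minimal sketch, not the released implementation. It assumes a pinhole camera with intrinsics `fx, fy, cx, cy` and a world-to-camera pose `(R, t)`; the function names `project`, `keypoint_mask`, and `reprojection_error` are hypothetical and chosen only for illustration. It shows how selected 3D keypoints could be projected with a ground-truth pose into a binary 2D mask, and how a reprojection error in pixels could be computed for a predicted scene coordinate.

```python
import math

def project(point_w, R, t, fx, fy, cx, cy):
    """Project a world-space 3D point to pixel coordinates.

    Assumes a pinhole model and a world-to-camera pose: X_c = R @ X_w + t.
    Returns None when the point lies on or behind the image plane.
    """
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)

def keypoint_mask(keypoints_w, R, t, fx, fy, cx, cy, width, height):
    """Build a binary mask (rows x columns) marking pixels hit by projected keypoints."""
    mask = [[0] * width for _ in range(height)]
    for p in keypoints_w:
        uv = project(p, R, t, fx, fy, cx, cy)
        if uv is None:
            continue  # keypoint is not in front of the camera
        u, v = int(round(uv[0])), int(round(uv[1]))
        if 0 <= u < width and 0 <= v < height:
            mask[v][u] = 1
    return mask

def reprojection_error(pixel, scene_coord, R, t, fx, fy, cx, cy):
    """Pixel distance between a pixel and the projection of its predicted 3D coordinate."""
    uv = project(scene_coord, R, t, fx, fy, cx, cy)
    if uv is None:
        return float("inf")
    return math.hypot(uv[0] - pixel[0], uv[1] - pixel[1])
```

In this sketch, a mask refinement step could for example drop mask pixels whose reprojection error exceeds a chosen threshold; the refinement rule used in the paper is not specified in the abstract, so that choice is an assumption.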

Brain Inspired Keypoint Matching for 3D Scene Reconstruction

LifelongGlue: Keypoint Matching for 3D Reconstruction with Continual Neural Networks

Fast and Lightweight Network Improves Serial Brain Section Stitching

Parallel K Nearest Neighbor Matching for 3D Reconstruction

Key Point Detection in 3D Reconstruction Based on Human-Computer Interaction

Exploring Matching Rates: from Keypoint Selection to Camera Relocalization

Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction

Multi-Task Joint Learning of 3D Keypoint Saliency and Correspondence Estimation

A Feature Matching Method Based on the Convolutional Neural Network

Head Pose Estimation through Keypoints Matching between Reconstructed 3D Face Model and 2D Image

DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization

Improved Template Matching Based Stereo Vision Sparse 3D Reconstruction Algorithm

KeypointDETR: an End-to-End 3D Keypoint Detector

PointRecon: Online Point-based 3D Reconstruction via Ray-based 2D-3D Matching

A Point Matching Strategy of 3D Loss Function for Single RGB Images Deep Mesh Reconstruction

Motion-based 3D Reconstruction and Applications to Virtual Reality

A Semi-Supervised Method for PatchMatch Multi-View Stereo with Sparse Points

3D Reconstruction Approach Based on Neural Network

SNAKE: Shape-aware Neural 3D Keypoint Field

3D Reconstruction Based on Long Image Sequence

KdO-Net: Towards Improving the Efficiency of Deep Convolutional Neural Networks Applied in the 3D Pairwise Point Feature Matching