Abstract: Visual localization plays a critical role in the operation of low-cost autonomous mobile robots. The leading methods for precise visual localization are predominantly 3D scene-specific: they require extra computation and memory to construct a 3D scene model for each new environment. Localizing directly against a database of 2D images is more flexible, but such methods currently suffer from limited accuracy. In this paper, we propose an accurate and robust 3D model-free visual localization system based on multiple checking to address these issues. To ensure high accuracy, we focus on estimating the pose of a query image relative to the retrieved database images using 2D-2D feature matches. On the theoretical side, by incorporating the local planar motion constraint into both the essential matrix estimation and the triangulation stages, we reduce the minimum number of feature matches required for absolute pose estimation, thereby enhancing the robustness of outlier rejection. Additionally, we introduce a multiple-checking mechanism to ensure the correctness of the solution throughout the solving process. Qualitative and quantitative evaluations on simulated data and two real-world datasets show significant improvements in accuracy and robustness from our 3D model-free visual localization system.

Note to Practitioners: This article is motivated by the need for an accurate visual localization system whose map is simple and flexible to construct and which adapts easily to new environments. Such a system has great practical value for a range of applications, including warehouse robots and service robots. Existing visual localization systems that achieve high accuracy depend on a pre-built, accurate 3D scene map, which is challenging to construct and consumes significant onboard storage, particularly for large scenes. Moreover, this effort must be repeated whenever the robot moves to a new scene. In this article, an accurate and robust 3D model-free visual localization system is proposed to handle this problem. Map construction is reduced to building a set of database images with associated camera poses, which amounts to adding posed images to a database. The core idea for achieving high accuracy and robustness is to build the local planar motion characteristic of general ground-moving robots into both the essential matrix estimation and the triangulation stages, yielding two minimal solutions. The proposed localization system makes it easier to switch the robot between application scenarios, reducing additional workload and making the system easier to use.
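
As a rough illustration of why a planar motion constraint lowers the number of matches needed, the sketch below builds the essential matrix for the textbook planar-motion case: rotation by an angle `theta` about the camera y-axis and a unit translation at angle `phi` in the x-z plane. This parameterization, the axis convention, and all function names are assumptions made for illustration; the local planar motion model used in the paper may differ. Under these assumptions the matrix has only four nonzero entries governed by two angles, whereas a general essential matrix has five degrees of freedom, which is why the general case needs the well-known five-point solver.

```python
import math
import random

def planar_essential(theta, phi):
    """E = [t]_x R for rotation theta about the y-axis and unit translation
    t = (sin(phi), 0, cos(phi)) in the x-z plane (assumed convention)."""
    c, s = math.cos(theta), math.sin(theta)
    tx, tz = math.sin(phi), math.cos(phi)
    return [
        [0.0, -tz, 0.0],
        [tz * c + tx * s, 0.0, tz * s - tx * c],
        [0.0, tx, 0.0],
    ]

def epipolar_residual(E, x1, x2):
    """Return x2^T E x1 for homogeneous image points x1 and x2."""
    Ex1 = [sum(E[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))

def simulate_match(theta, phi, X):
    """Project 3D point X (camera-1 frame) into both views, with Y = R X + t."""
    c, s = math.cos(theta), math.sin(theta)
    tx, tz = math.sin(phi), math.cos(phi)
    Y = [c * X[0] + s * X[2] + tx, X[1], -s * X[0] + c * X[2] + tz]
    x1 = [X[0] / X[2], X[1] / X[2], 1.0]
    x2 = [Y[0] / Y[2], Y[1] / Y[2], 1.0]
    return x1, x2

# Synthetic check: noise-free matches must satisfy the epipolar constraint.
random.seed(0)
theta, phi = 0.3, -0.8  # hypothetical motion parameters
E = planar_essential(theta, phi)
residuals = []
for _ in range(5):
    X = [random.uniform(-2, 2), random.uniform(-1, 1), random.uniform(4, 8)]
    x1, x2 = simulate_match(theta, phi, X)
    residuals.append(abs(epipolar_residual(E, x1, x2)))
max_residual = max(residuals)
```

Because only two parameters remain, each hypothesis inside a RANSAC-style loop can be drawn from a smaller sample of matches, so an all-inlier sample is more likely at a given outlier ratio. This is consistent with the robustness argument made in the abstract, though the sketch does not reproduce the paper's solver.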

3D Model-free Visual Localization System from Essential Matrix under Local Planar Motion
