Abstract: Visual localization plays a critical role in low-cost autonomous mobile robots. The leading methods for precise visual localization are predominantly 3D scene-specific: they require extra computation and memory to construct a 3D scene model in every new environment. Localizing directly against a database of 2D images is more flexible, but such methods currently suffer from limited accuracy. In this paper, we propose an accurate and robust multiple-checking-based 3D model-free visual localization system to address these issues. To ensure high accuracy, we estimate the pose of a query image relative to the retrieved database images using 2D-2D feature matches. Theoretically, by incorporating the local planar motion constraint into both the essential matrix estimation and the triangulation stages, we reduce the minimum number of feature matches required for absolute pose estimation, thereby making outlier rejection more robust. Additionally, we introduce a multiple-checking mechanism to ensure the correctness of the solution throughout the solving process. The efficacy of our approach is substantiated through qualitative and quantitative assessments on simulated data and two real-world datasets, which show significant improvements in accuracy and robustness provided by our 3D model-free visual localization system.
Note to Practitioners: The motivation of this article stems from the need for an accurate visual localization system whose map is simple and flexible to construct and which adapts easily to new environments. Such a system has great practical value for a range of applications, including warehouse robots and service robots. Existing visual localization systems that achieve high accuracy depend on a pre-built, accurate 3D scene map, which is difficult to construct and consumes significant onboard storage, particularly for large scenes. Moreover, this effort must be repeated whenever the robot moves to a new scene. In this article, an accurate and robust 3D model-free visual localization system is proposed to handle this problem. Map construction is reduced to building a set of database images with associated camera poses, which amounts to adding posed images to a database. The core idea for achieving high accuracy and robustness is to model the local planar motion characteristic of general ground-moving robots in both the essential matrix estimation and the triangulation stages, yielding two minimal solutions. The proposed localization system simplifies switching the robot between application scenarios, reducing additional workload and lowering the difficulty of use.
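
To illustrate why a planar motion constraint lowers the number of matches needed, the sketch below uses the generic textbook parameterization of planar motion, not the solver proposed in this paper. It assumes a camera that rotates only about its y axis and translates only in its x-z plane, with the convention X2 = R X1 + t and E = [t]x R. Under these assumptions the essential matrix has only four nonzero entries, so three correspondences determine it linearly up to scale, compared with five for general motion. The function names (`planar_essential`, `solve_planar_essential`, `project`) and the angle parameters `theta` and `phi` are hypothetical and chosen for this example only.

```python
import numpy as np


def planar_essential(theta, phi):
    """Build E = [t]x R for an assumed planar motion.

    theta: rotation angle about the camera y axis (assumption).
    phi:   direction of the unit translation in the x-z plane (assumption).
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    t = np.array([np.sin(phi), 0.0, np.cos(phi)])
    t_cross = np.array([[0.0, -t[2], t[1]],
                        [t[2], 0.0, -t[0]],
                        [-t[1], t[0], 0.0]])
    return t_cross @ R, R, t


def project(X):
    """Convert 3D points (N, 3) to normalized homogeneous image points."""
    return X / X[:, 2:3]


def solve_planar_essential(x1, x2):
    """Linear estimate of a planar-motion essential matrix (up to scale).

    x1, x2: (N, 3) normalized homogeneous points with N >= 3.
    Only E[0,1], E[1,0], E[1,2], E[2,1] are nonzero under the planar
    assumption, so each match x2^T E x1 = 0 gives one linear equation
    in four unknowns.
    """
    A = np.column_stack([
        x2[:, 0] * x1[:, 1],   # coefficient of E[0, 1]
        x2[:, 1] * x1[:, 0],   # coefficient of E[1, 0]
        x2[:, 1] * x1[:, 2],   # coefficient of E[1, 2]
        x2[:, 2] * x1[:, 1],   # coefficient of E[2, 1]
    ])
    # The null vector of A is the right singular vector associated with
    # the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    e = Vt[-1]
    E = np.zeros((3, 3))
    E[0, 1], E[1, 0], E[1, 2], E[2, 1] = e
    return E


if __name__ == "__main__":
    E_true, R, t = planar_essential(theta=0.2, phi=0.5)
    X1 = np.array([[1.0, 0.5, 5.0], [-1.0, 1.0, 6.0], [0.5, -1.0, 4.0]])
    X2 = X1 @ R.T + t
    E_est = solve_planar_essential(project(X1), project(X2))
    # Epipolar residuals of the three noise-free matches.
    print(np.einsum("ni,ij,nj->n", project(X2), E_est, project(X1)))
```

The four nonzero entries additionally satisfy E[0,1]^2 + E[2,1]^2 = E[1,0]^2 + E[1,2]^2, which is what allows two-match minimal solvers for this model. A smaller sample size means a RANSAC loop needs fewer iterations to draw an outlier-free sample, which is consistent with the robustness argument made in the abstract.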

Visual Autonomy via 2D Matching in Rendered 3D Models

3D Model-free Visual Localization System from Essential Matrix under Local Planar Motion

Leveraging Local Planar Motion Property for Robust Visual Matching and Localization.

2-Entity RANSAC for Robust Visual Localization in Changing Environment

2-Entity Random Sample Consensus for Robust Visual Localization: Framework, Methods, and Verifications

Visual Odometry Based 3D-Reconstruction

Efficient 2D-3D Matching for Multi-Camera Visual Localization

Visual Localization in a Prior 3D LiDAR Map Combining Points and Lines

3D Move to See: Multi-perspective visual servoing for improving object views with semantic segmentation

Real-time 3D mapping using a 2D laser scanner and IMU-aided visual SLAM

The Role of Global Appearance of Omnidirectional Images in Relative Distance and Orientation Retrieval

A Human–Robot Collaborative System for Robust Three-Dimensional Mapping

Monocular Visual Place Recognition in LiDAR Maps via Cross-Modal State Space Model and Multi-View Matching

Visual Odometry through Appearance- and Feature-Based Method with Omnidirectional Images

Spatially Visual Perception for End-to-End Robotic Learning

Towards Localizing Structural Elements: Merging Geometrical Detection with Semantic Verification in RGB-D Data

Sensor Deployment for Visual 3D Perception: A Perspective of Information Gains

Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations

Map building and monte carlo localization using global appearance of omnidirectional images

SDVL: Efficient and Accurate Semi-Direct Visual Localization