Map-Free Visual Relocalization Enhanced by Instance Knowledge and Depth Knowledge

Mingyu Xiao, Runze Chen, Haiyong Luo, Fang Zhao, Juan Wang, Xuepeng Ma
2024-09-19
Abstract: Map-free relocalization technology is crucial for applications in autonomous navigation and augmented reality, but relying on pre-built maps is often impractical. It faces significant challenges due to limitations in matching methods and the inherent lack of scale in monocular images. These issues lead to substantial rotational and metric errors and even localization failures in real-world scenarios. Large matching errors significantly impact the overall relocalization process, affecting both rotational and translational accuracy. Due to the inherent limitations of the camera itself, recovering the metric scale from a single image is crucial, as this significantly impacts the translation error. To address these challenges, we propose a map-free relocalization method enhanced by instance knowledge and depth knowledge. By leveraging instance-based matching information to improve global matching results, our method significantly reduces the possibility of mismatching across different objects. The robustness of instance knowledge across the scene helps the feature point matching model focus on relevant regions and enhance matching accuracy. Additionally, we use estimated metric depth from a single image to reduce metric errors and improve scale recovery accuracy. By integrating methods dedicated to mitigating large translational and rotational errors, our approach demonstrates superior performance in map-free relocalization techniques.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two challenges in **map-free visual relocalization**: unreliable feature point matching and the lack of scale information in monocular images. Specifically:

1. **Feature point matching problem**: Inaccurate or mismatched feature points in the scene lead to large rotation errors in camera pose estimation.
2. **Lack of scale information in monocular images**: A monocular image cannot provide absolute scale, which leads to large translation errors and reduces relocalization accuracy.

To solve these problems, the authors propose an enhanced map-free visual relocalization method that combines instance knowledge and depth knowledge to improve the robustness of feature point matching and the accuracy of scale recovery.

### Main contributions

1. **Hierarchical matching method**:
   - Instance segmentation restricts the matching range to a specific instance, which improves the precision of feature point matching. Combining global matching information with instance-level matching reduces large matching errors and strengthens precise local matching.
   - An instance segmentation model (such as SegGPT) extracts the same objects from the reference image and the query image, and fine-grained matching is performed within them.
2. **Depth-enhanced scale recovery**:
   - A depth estimation model (such as Metric3D) provides reliable metric depth from a single image, which improves scale recovery and reduces translation errors.
   - The depth information is combined with the matched point pairs to compute 3D coordinates, the RANSAC algorithm selects the best scale factor, and the result is the scaled translation vector.
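The scale recovery step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the relative rotation `R` and a unit-norm translation direction `t_unit` are already known (for example from essential matrix decomposition), and that the back-projected points satisfy `X_qry = R @ X_ref + s * t_unit` for an unknown scale `s`. The function names `backproject` and `ransac_scale`, and the parameters `thresh` and `iters`, are hypothetical choices made for this example.

```python
import numpy as np


def backproject(pts, depth, K):
    """Lift pixel coordinates (N, 2) with metric depth (N,) to 3D camera coordinates (N, 3)."""
    ones = np.ones((pts.shape[0], 1))
    rays = np.linalg.inv(K) @ np.hstack([pts, ones]).T  # (3, N) rays with z = 1
    return (rays * depth).T


def ransac_scale(X_ref, X_qry, R, t_unit, thresh=0.1, iters=200, seed=0):
    """Estimate the scale s in X_qry = R @ X_ref + s * t_unit with a 1-point RANSAC.

    Assumed formulation for illustration only; the paper's exact procedure may differ.
    """
    rng = np.random.default_rng(seed)
    t_unit = t_unit / np.linalg.norm(t_unit)

    # Each correspondence should satisfy residual_i = s * t_unit.
    residual = X_qry - X_ref @ R.T
    # Least-squares scale for a single correspondence: projection onto t_unit.
    candidates = residual @ t_unit

    best_inliers = np.zeros(len(candidates), dtype=bool)
    for _ in range(iters):
        s = candidates[rng.integers(len(candidates))]  # hypothesis from one sample
        err = np.linalg.norm(residual - s * t_unit, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers

    if not best_inliers.any():
        raise ValueError("no scale hypothesis produced any inliers")

    # Refit on the consensus set: the mean projection is the least-squares scale.
    s = candidates[best_inliers].mean()
    return s, s * t_unit, best_inliers
```

A single correspondence is enough to hypothesize a scale, so each RANSAC iteration is cheap, and the inlier threshold `thresh` is expressed in meters because the depth is metric.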
### Experimental results

The authors conducted extensive experiments on the Map-free Visual Relocalization dataset. The results show that the method clearly outperforms existing methods in terms of Average Median Pose Error. Specifically:

- The Average Median Rotation Error is reduced by approximately 13.593°.
- The Average Median Translation Error is reduced by approximately 1.071 meters.

The method also performs well on the other three evaluation metrics (Average Median Reprojection Error, Precision@VCRE < 90px, and AUC@VCRE < 90px), which further supports its effectiveness and robustness.

### Conclusion

This research proposes a new map-free visual relocalization method. By combining instance knowledge and depth knowledge, it addresses the challenges of feature point matching and scale recovery and significantly improves the accuracy and robustness of relocalization.