MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps

Hang Wu, Zhenghao Zhang, Siyuan Lin, Xiangru Mu, Qiang Zhao, Ming Yang, Tong Qin
2024-07-11
Abstract: Robust localization is the cornerstone of autonomous driving, especially in challenging urban environments where GPS signals suffer from multipath errors. Traditional localization approaches rely on high-definition (HD) maps, which consist of precisely annotated landmarks. However, building HD maps is expensive and challenging to scale up. Given these limitations, leveraging navigation maps has emerged as a promising low-cost alternative for localization. Current approaches based on navigation maps can achieve highly accurate localization, but their complex matching strategies lead to unacceptable inference latency that fails to meet real-time demands. To address these limitations, we propose a novel transformer-based neural re-localization method. Inspired by image registration, our approach performs a coarse-to-fine neural feature registration between navigation map features and visual bird's-eye-view features. Our method significantly outperforms the current state-of-the-art OrienterNet on both the nuScenes and Argoverse datasets, with improvements of nearly 10%/20% in localization accuracy and 30/16 FPS in the single-view and surround-view input settings, respectively. We highlight that our research presents an HD-map-free localization method for autonomous driving, offering cost-effective, reliable, and scalable performance in challenging driving environments.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper "MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps" addresses robust localization for autonomous driving, especially in urban environments where GPS signals are degraded by multipath errors. Traditional localization methods rely on high-definition (HD) maps, which contain precisely annotated landmarks, but these maps are expensive to build and maintain and are difficult to scale. Navigation maps have therefore emerged as a promising low-cost alternative for localization. Existing navigation-map-based methods can localize with high accuracy, but their complex matching strategies cause unacceptable inference latency and cannot meet real-time requirements.

To address this, the authors propose MapLocNet, a new transformer-based neural re-localization method. It aligns navigation map features with visual bird's-eye-view (BEV) features through coarse-to-fine neural feature registration, which significantly improves inference speed while maintaining high accuracy. The main contributions of MapLocNet are as follows:

1. **Propose MapLocNet**: It fuses surround-view images with navigation maps to achieve high-precision localization, particularly in areas with poor GPS signal, where it corrects significant position drift.
2. **Introduce a hierarchical coarse-to-fine feature registration strategy**: It effectively aligns BEV features with map features, yielding significant improvements in both localization accuracy and inference speed over existing methods.
3. **Develop new training criteria**: Perception tasks serve as auxiliary targets for pose prediction, which lets MapLocNet reach state-of-the-art localization accuracy on the nuScenes and Argoverse datasets.

Overall, this research provides a reliable, efficient, and scalable localization method without the need for high-definition maps, which is suitable for complex driving environments.
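
To make the coarse-to-fine idea concrete, here is a minimal sketch of two-stage registration on plain 2D arrays. It is not MapLocNet's method: the paper describes a learned, transformer-based registration, while this toy uses classical exhaustive search with a sum-of-squared-differences cost and estimates translation only. All names (`downsample`, `best_offset`, `coarse_to_fine_register`, `map_features`, `bev_features`) are hypothetical, and the random arrays are stand-ins for real feature maps.

```python
import numpy as np


def downsample(x, factor):
    """Average-pool a 2D array by an integer factor (sizes must divide evenly)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def best_offset(template, image, dy_range, dx_range):
    """Return the (dy, dx) in the given ranges that minimizes the sum of
    squared differences between `template` and the matching crop of `image`."""
    th, tw = template.shape
    ih, iw = image.shape
    best, best_cost = None, np.inf
    for dy in dy_range:
        for dx in dx_range:
            if dy < 0 or dx < 0 or dy + th > ih or dx + tw > iw:
                continue  # crop would fall outside the image
            crop = image[dy:dy + th, dx:dx + tw]
            cost = float(np.sum((crop - template) ** 2))
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def coarse_to_fine_register(template, image, factor=4):
    """Estimate where `template` sits inside `image` in two stages."""
    # Coarse stage: exhaustive search on average-pooled copies.
    t_small = downsample(template, factor)
    i_small = downsample(image, factor)
    cy, cx = best_offset(
        t_small, i_small,
        range(i_small.shape[0] - t_small.shape[0] + 1),
        range(i_small.shape[1] - t_small.shape[1] + 1),
    )
    # Fine stage: search full-resolution offsets near the upscaled coarse guess.
    y0, x0 = cy * factor, cx * factor
    return best_offset(
        template, image,
        range(y0 - factor, y0 + factor + 1),
        range(x0 - factor, x0 + factor + 1),
    )


rng = np.random.default_rng(0)
map_features = rng.standard_normal((64, 64))  # stand-in for map features
bev_features = map_features[20:36, 28:44]     # stand-in BEV patch at offset (20, 28)
estimate = coarse_to_fine_register(bev_features, map_features)
print(estimate)  # (20, 28)
```

In this toy setup the two stages evaluate 169 + 81 = 250 candidate offsets, compared with 2401 for an exhaustive full-resolution search, which illustrates why a coarse-to-fine scheme can cut latency. The example places the patch at an offset that is a multiple of the pooling factor; for unaligned offsets or noisy features the pooled search is only approximate, and the fine stage's search radius would need to be chosen accordingly.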