SpaGBOL: Spatial-Graph-Based Orientated Localisation

Tavis Shore, Oscar Mendez, Simon Hadfield
2024-09-24
Abstract: Cross-View Geo-Localisation within urban regions is challenging in part due to the lack of geo-spatial structuring within current datasets and techniques. We propose utilising graph representations to model sequences of local observations and the connectivity of the target location. Modelling as a graph enables generating previously unseen sequences by sampling with new parameter configurations. To leverage this newly available information, we propose a GNN-based architecture, producing spatially strong embeddings and improving discriminability over isolated image embeddings. We outline SpaGBOL, introducing three novel contributions. 1) The first graph-structured dataset for Cross-View Geo-Localisation, containing multiple streetview images per node to improve generalisation. 2) Introducing GNNs to the problem, we develop the first system that exploits the correlation between node proximity and feature similarity. 3) Leveraging the unique properties of the graph representation - we demonstrate a novel retrieval filtering approach based on neighbourhood bearings. SpaGBOL achieves state-of-the-art accuracies on the unseen test graph - with relative Top-1 retrieval improvements on previous techniques of 11%, and 50% when filtering with Bearing Vector Matching on the SpaGBOL dataset.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses Cross-View Geo-Localisation (CVGL) in urban areas. Existing CVGL techniques mainly rely on sparse pairs of street-view and satellite images and ignore the structural relationships between geographic locations. As a result, they perform poorly on sequences that only partially overlap with known ones, or on new sequences that combine several existing ones. Existing datasets and techniques also lack geospatial structure, which limits their feasibility in real-world applications. To address these issues, the authors propose the following key points:

1. **Utilizing graph structure to represent data**: Organizing image data into a graph makes it possible to model sequences of local observations and the connectivity of the target location. The graph structure allows new, unseen sequences to be generated, and lets the spatial relationships between nodes improve the robustness and distinctiveness of feature representations.
2. **Introducing Graph Neural Networks (GNN)**: A GNN-based architecture is developed to produce spatially strong embeddings that are more discriminative than isolated image embeddings.
3. **Proposing a new retrieval filtering method**: Retrieval filtering based on Bearing Vector Matching (BVM), which compares neighbourhood bearings, is introduced to further improve retrieval accuracy.
4. **Constructing a new dataset**: The SpaGBOL dataset is released, a multi-city graph-structured dataset containing multiple street-view images per node, aimed at improving the model's generalization ability under different time, weather, and lighting conditions.

Through these innovations, the authors hope to push CVGL technology closer to practical applications, especially in dense urban areas where GNSS signal loss is common.
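The idea of generating previously unseen sequences from a graph can be illustrated with a small sketch. This is not the authors' sampling procedure: the toy graph `ROAD_GRAPH`, the function `sample_walk`, and the no-backtracking rule are all assumptions made for illustration. It only shows how varying a parameter such as walk length yields different sequences from the same graph.

```python
import random

# Hypothetical toy road graph: node id -> neighbouring node ids.
# In a real dataset each node would also carry street-view and satellite images.
ROAD_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}


def sample_walk(graph, start, length, rng):
    """Sample a walk of `length` nodes, avoiding immediate backtracking when possible."""
    walk = [start]
    previous = None
    while len(walk) < length:
        current = walk[-1]
        options = [n for n in graph[current] if n != previous]
        if not options:  # dead end: allow turning back
            options = list(graph[current])
        if not options:  # isolated node: stop early
            break
        previous = current
        walk.append(rng.choice(options))
    return walk


# Different lengths (one possible "parameter configuration") give different sequences.
rng = random.Random(0)
walks = [sample_walk(ROAD_GRAPH, "a", length, rng) for length in (3, 4, 5)]
for walk in walks:
    print(" -> ".join(walk))
```

Because the sequences are drawn from the graph rather than stored as fixed lists, changing the start node, length, or random seed produces sequences that never appeared during training.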
Experimental results show that, on the unseen test graph, SpaGBOL achieves a relative Top-1 retrieval improvement of 11% over previous techniques, and of 50% when retrieval is filtered with BVM on the SpaGBOL dataset.
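To give a feel for how filtering on neighbourhood bearings could prune a ranked retrieval list, here is a minimal sketch. It is an assumption-based illustration, not the paper's Bearing Vector Matching: the 8-bin binary encoding, exact-match filtering, planar coordinates, and the names `bearing_deg`, `bearing_vector`, and `filter_candidates` are all hypothetical. It also assumes the bearings of the roads leaving the query location are known, for example from a compass-aligned observation.

```python
import math


def bearing_deg(origin, target):
    """Clockwise bearing from north, in degrees, for planar (x=east, y=north) points."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def bearing_vector(node, neighbours, bins=8):
    """Binary vector marking which angular bins contain a neighbouring node.

    Bins are centred on the compass directions, so bin 0 covers north +/- 22.5 degrees
    when bins=8.
    """
    vec = [0] * bins
    width = 360.0 / bins
    for nb in neighbours:
        shifted = (bearing_deg(node, nb) + width / 2) % 360.0
        vec[int(shifted // width) % bins] = 1
    return vec


def filter_candidates(query_vec, candidates):
    """Keep ranked (id, vector) candidates whose bearing vector equals the query's."""
    return [cid for cid, vec in candidates if vec == query_vec]


# Query: a T-junction with roads leaving north, east, and south.
query_vec = bearing_vector((0, 0), [(0, 1), (1, 0), (0, -1)])

# Candidates, assumed already ranked by embedding similarity (best first).
candidates = [
    ("n1", bearing_vector((5, 5), [(5, 6), (6, 5), (5, 4), (4, 5)])),  # crossroads
    ("n2", bearing_vector((9, 2), [(9, 3), (10, 2), (9, 1)])),         # T-junction
    ("n3", bearing_vector((3, 7), [(3, 8), (3, 6)])),                  # straight road
]

kept = filter_candidates(query_vec, candidates)
print(kept)
```

In this toy case the top-ranked candidate is a crossroads whose layout does not match the query, so the filter removes it and promotes the T-junction. A practical system would likely need a tolerance for noisy bearings rather than exact equality.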