Abstract: Cross-View Geo-Localisation within urban regions is challenging in part due to the lack of geo-spatial structuring within current datasets and techniques. We propose utilising graph representations to model sequences of local observations and the connectivity of the target location. Modelling as a graph enables generating previously unseen sequences by sampling with new parameter configurations. To leverage this newly available information, we propose a GNN-based architecture, producing spatially strong embeddings and improving discriminability over isolated image embeddings. We outline SpaGBOL, introducing three novel contributions. 1) The first graph-structured dataset for Cross-View Geo-Localisation, containing multiple streetview images per node to improve generalisation. 2) Introducing GNNs to the problem, we develop the first system that exploits the correlation between node proximity and feature similarity. 3) Leveraging the unique properties of the graph representation, we demonstrate a novel retrieval filtering approach based on neighbourhood bearings. SpaGBOL achieves state-of-the-art accuracies on the unseen test graph, with relative Top-1 retrieval improvements over previous techniques of 11%, and of 50% when filtering with Bearing Vector Matching on the SpaGBOL dataset.

What problem does this paper attempt to address?

The paper addresses Cross-View Geo-Localisation (CVGL) in urban areas. Existing CVGL techniques rely mainly on isolated street view-satellite image pairs and ignore the structural relationships between geographic locations. As a result, they perform poorly on sequences that only partially overlap with known ones, or on new sequences that combine several existing ones. Existing datasets and techniques also lack geospatial structure, which limits their feasibility in real-world applications. To address these issues, the authors propose the following:

1. **Utilizing graph structure to represent data**: Organizing the image data as a graph models both sequences of local observations and the connectivity of the target location. The graph makes it possible to generate previously unseen sequences by sampling, and to use spatial relationships between nodes to make feature representations more robust and distinctive.
2. **Introducing Graph Neural Networks (GNN)**: A GNN-based architecture produces embeddings that encode spatial context, making them more discriminative than embeddings of isolated images.
3. **Proposing a new retrieval filtering method**: Retrieved candidates are filtered using Bearing Vector Matching (BVM), which is based on neighbourhood bearings, to further improve retrieval accuracy.
4. **Constructing a new dataset**: The SpaGBOL dataset is a multi-city, graph-structured dataset with multiple street view images per node, intended to improve the model's generalization across different times, weather, and lighting conditions.

Through these contributions, the authors aim to bring CVGL closer to practical use, especially in dense urban areas where GNSS signal loss is common.
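
The idea of generating new observation sequences by sampling a graph can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy adjacency map, the `sample_walk` function, and the rule of avoiding immediate backtracking are all hypothetical choices made here. The walk length stands in for the "parameter configurations" mentioned in the abstract.

```python
import random

# Hypothetical toy road graph: node id -> neighbouring node ids.
# In a real dataset each node would also carry its street view and
# satellite images; they are omitted here.
ADJACENCY = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}


def sample_walk(adjacency, start, length, rng):
    """Sample a walk of up to `length` nodes starting at `start`.

    Assumption for illustration: the walk avoids stepping straight back
    to the previous node unless it has reached a dead end.
    """
    walk = [start]
    previous = None
    while len(walk) < length:
        current = walk[-1]
        options = [n for n in adjacency[current] if n != previous]
        if not options:
            # Dead end: allow turning back the way we came.
            options = list(adjacency[current])
        if not options:
            # Isolated node: the walk cannot be extended.
            break
        previous = current
        walk.append(rng.choice(options))
    return walk


# Different lengths (or seeds) yield sequences that need not exist
# in any pre-recorded set of sequences.
rng = random.Random(0)
short_walk = sample_walk(ADJACENCY, "a", 3, rng)
long_walk = sample_walk(ADJACENCY, "a", 5, rng)
```

Because sequences are drawn from the graph rather than stored as fixed lists, changing `length` or the start node is enough to produce new training or test sequences over the same locations.
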
Experimental results show that, on the unseen test graph, SpaGBOL achieves a relative Top-1 retrieval improvement of 11% over previous techniques, rising to a 50% relative improvement when retrievals are filtered with Bearing Vector Matching (BVM) on the SpaGBOL dataset.
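
The text above does not specify how Bearing Vector Matching works, so the following is only a sketch of how bearing-based filtering of a ranked retrieval list could look. It assumes that each node is described by the compass bearings (in degrees) of its outgoing roads, that the same bearings are available for the query, and that a candidate is kept when it has the same number of roads and every query bearing lies within a tolerance of some candidate bearing. The function names and the 20-degree tolerance are hypothetical.

```python
def circular_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def bearings_match(query, candidate, tol=20.0):
    """Return True if the two bearing sets are compatible.

    Assumption for illustration: equal road count, and every query
    bearing within `tol` degrees of at least one candidate bearing.
    This is a loose check, not a strict one-to-one assignment.
    """
    if len(query) != len(candidate):
        return False
    return all(
        min(circular_diff(q, c) for c in candidate) <= tol for q in query
    )


def filter_ranked(query_bearings, ranked, tol=20.0):
    """Drop incompatible candidates while preserving the ranking order.

    `ranked` is a list of (node_id, bearings) pairs, best match first.
    """
    return [
        node_id
        for node_id, bearings in ranked
        if bearings_match(query_bearings, bearings, tol)
    ]


# Hypothetical ranked retrievals for a three-way junction query.
ranked = [
    ("n1", [0.0, 90.0, 180.0]),
    ("n2", [10.0, 185.0, 265.0]),
    ("n3", [5.0, 175.0]),
]
kept = filter_ranked([5.0, 182.0, 270.0], ranked)
```

In this toy example only `n2` survives: `n1` has no road near 270 degrees and `n3` has a different number of roads, so the top-ranked candidate after filtering changes from `n1` to `n2`.
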

SpaGBOL: Spatial-Graph-Based Orientated Localisation

BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation

BDLoc: Global Localization from 2.5D Building Map

Learning Large-scale Location Embedding From Human Mobility Trajectories with Graphs

From Satellite to Ground: Satellite Assisted Visual Localization with Cross-view Semantic Matching

SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs

GOReloc: Graph-based Object-Level Relocalization for Visual SLAM

PEnG: Pose-Enhanced Geo-Localisation

Long-term Visual Map Sparsification with Heterogeneous GNN

GSLoc: Visual Localization with 3D Gaussian Splatting

LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images

An Improved Visual SLAM Algorithm Based on Graph Neural Network

Graph sampling based deep metric learning for cross-view geo-localization

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

Integrating multi-modal information to detect spatial domains of spatial transcriptomics by graph attention network

SpaNCMG: improving spatial domains identification of spatial transcriptomics using neighborhood-complementary mixed-view graph convolutional network

SeGCN: A Semantic-Aware Graph Convolutional Network for UAV Geo-Localization

Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes

CurriculumLoc: Enhancing Cross-Domain Geolocalization Through Multistage Refinement

S3E-GNN: Sparse Spatial Scene Embedding with Graph Neural Networks for Camera Relocalization
