Abstract: Cross-view image based geo-localization aims to determine the location of a given street-view image by matching it against a collection of geo-tagged aerial images, and has important applications in remote sensing information utilization and augmented reality. Most current cross-view image based geo-localization methods focus on image content and ignore the relations between feature nodes, so the available information is not fully exploited. To address this problem, this study proposes feature relation guided cross-view image based geo-localization. The method first applies a polar transform to the aerial remote sensing images to achieve a coarse geometric alignment between ground and aerial images. It then models local contextual feature attention and global feature correlation through the feature relation guided attention generation module designed in this study. Specifically, the module has two branches: multiscale contextual feature extraction based on deformable convolution, and global spatial relation mining. Together they capture global structural information between feature nodes at different locations while correlating contextual features and guiding global feature attention generation. Finally, a novel feature aggregation module, MixVPR, is introduced to aggregate global feature descriptors for image matching and localization. In experiments on CVUSA, a popular public cross-view dataset, the proposed algorithm achieves 92.08%, 97.70%, and 98.66% on the top-1, top-5, and top-10 metrics, respectively, and outperforms algorithms of the same type.
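The top-1, top-5, and top-10 figures above can be read as retrieval accuracy over global descriptors. The following is a minimal sketch of how such a metric might be computed, assuming it is the usual top-K recall (a query counts as a hit if its true aerial image ranks among the K most similar) and that descriptors are compared by cosine similarity. The function names `l2_normalize`, `cosine_similarity`, and `top_k_recall` and the pairing convention (ground descriptor `i` matches aerial descriptor `i`) are illustrative assumptions, not details taken from the study.

```python
from math import sqrt


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity between two descriptors given as lists of floats."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def top_k_recall(ground_desc, aerial_desc, ks=(1, 5, 10)):
    """Percentage of ground queries whose true aerial match ranks in the top K.

    Assumption (hypothetical convention): ground_desc[i] and aerial_desc[i]
    describe the same location.
    """
    hits = {k: 0 for k in ks}
    for i, query in enumerate(ground_desc):
        sims = [cosine_similarity(query, ref) for ref in aerial_desc]
        # Rank of the true match = number of aerial images scoring strictly higher.
        rank = sum(1 for s in sims if s > sims[i])
        for k in ks:
            if rank < k:
                hits[k] += 1
    n = len(ground_desc)
    return {k: 100.0 * hits[k] / n for k in ks}
```

For example, calling `top_k_recall(ground, aerial, ks=(1, 5, 10))` on the descriptors of a test split returns a dictionary mapping each K to a percentage. The brute-force loop is quadratic in the number of images and is meant only to make the definition concrete; a real evaluation would typically use a batched matrix product.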

Multi-support-region Image Descriptors and Its Application to Street Landmark Localization

A Deformable Local Image Descriptor

Local Interest Region Description Using Multiple Support Regions

Visual Localizer: Outdoor Localization Based on ConvNet Descriptor and Global Optimization for Visually Impaired Pedestrians

Robust Local Feature Descriptor for Multisource Remote Sensing Image Registration

Generic Image Manipulation Localization Through the Lens of Multi-scale Spatial Inconsistence

Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images

Local Descriptor for Robust Place Recognition using LiDAR Intensity

SSA-Net: Spatial Scale Attention Network for Image-Based Geo-Localization

SMRD: A Local Feature Descriptor for Multi-modal Image Registration

CMLocate: A Cross‐modal Automatic Visual Geo‐localization Framework for a Natural Environment Without GNSS Information

Socio-Mobile Landmark Recognition Using Local Features with Adaptive Region Selection

Deformable Registration of Ultrasound and Magnetic Resonance Imaging Using A New Self-Similarity Based Neighborhood Descriptor

Feature Relation Guided Cross-View Image Based Geo-Localization

Learning a Robust Hybrid Descriptor for Robot Visual Localization

Visual place recognition using landmark distribution descriptors

MDCS with Fully Encoding the Information of Local Shape Description for 3D Rigid Data Matching

Shape-Adaptive Modality Independent Region Descriptor for Multimodal Remote Sensing Image Matching

Landmark Image Retrieval by Jointing Feature Refinement and Multimodal Classifier Learning

Multi-sensor Image Registration by Combining Local Self-Similarity Matching and Mutual Information

Hierarchical visual localization for visually impaired people using multimodal images