Multibranch Joint Representation Learning Based on Information Fusion Strategy for Cross-View Geo-Localization
Fawei Ge, Yunzhou Zhang, Yixiu Liu, Guiyuan Wang, Sonya Coleman, Dermot Kerr, Li Wang
DOI: https://doi.org/10.1109/tgrs.2024.3378453
IF: 8.2
2024-03-30
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Cross-view geo-localization refers to recognizing images of the same geographic target obtained from different platforms (such as drone-view, satellite-view, and ground-view). However, cross-view geo-localization is challenging because capturing images from different platforms, coupled with extreme viewpoint variations, can cause significant changes to the visual image content. Existing methods mainly focus on mining fine-grained features or the contextual information in neighboring areas, but ignore the complete information of the entire image and the association of contextual information between adjacent regions. Therefore, a multibranch joint representation learning network model based on information fusion strategies (IFSs) is proposed to solve this cross-view geo-localization problem. First, we obtain feature information from the image through a global information fusion (GIF) branch and a local information fusion (LIF) branch to help the network learn the discernible information in the different images. In addition, a local-guided-GIF (LGGIF) branch is introduced so that local information assists global features, enhancing the learning of potential information in the images. Second, we introduce different IFSs in each branch to increase the extraction of contextual information by expanding the global receptive field, thus improving the performance of the model. Finally, a series of experiments is carried out on four prevailing benchmark datasets, namely University-1652, SUES-200, CVUSA, and CVACT. The quantitative comparisons indicate that the proposed network framework outperforms some state-of-the-art methods. For example, on the University-1652 dataset, the improvements in R@1 and AP are 1.91% and 2.18%, and 1.55% and 2.99%, in the two tasks, respectively.
imaging science & photographic technology,remote sensing,engineering, electrical & electronic,geochemistry & geophysics
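
The abstract reports results as R@1 and AP, which are retrieval metrics: each query image is compared against a gallery of images from another view, and the gallery is ranked by feature similarity. The sketch below illustrates the textbook definitions of these two metrics on toy 2-D feature vectors. It is only an illustration, not the authors' evaluation code: the function names, the cosine similarity choice, and the toy data are all assumptions, and the official benchmark scripts may compute AP with a slightly different approximation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_gallery(query, gallery):
    """Return gallery indices sorted from most to least similar to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -sims[i])


def recall_at_k(ranking, gallery_labels, query_label, k=1):
    """1.0 if a true match appears among the top-k ranked gallery items, else 0.0."""
    top_k = ranking[:k]
    return 1.0 if any(gallery_labels[i] == query_label for i in top_k) else 0.0


def average_precision(ranking, gallery_labels, query_label):
    """Mean of the precision values measured at the rank of each true match."""
    hits = 0
    precisions = []
    for rank, idx in enumerate(ranking, start=1):
        if gallery_labels[idx] == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy data: four gallery features belonging to two locations.
gallery = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.2, 0.8]]
gallery_labels = ["A", "B", "A", "B"]

# A query of location "A" whose feature happens to lie closer to location "B".
query, query_label = [0.5, 0.6], "A"

ranking = rank_gallery(query, gallery)
r1 = recall_at_k(ranking, gallery_labels, query_label, k=1)
ap = average_precision(ranking, gallery_labels, query_label)
print(ranking, r1, round(ap, 4))
```

In this toy case the two true matches land at ranks 3 and 4, so R@1 is 0 and AP is the mean of 1/3 and 2/4. Reported benchmark figures are these per-query values averaged over all queries.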