Fusing Geometric and Scene Information for Cross-View Geo-Localization.

Siyuan Guo, Tianying Liu, Wengen Li, Jihong Guan, Shuigeng Zhou
DOI: https://doi.org/10.1145/3511808.3557633
2022-01-01
Abstract: Cross-view geo-localization aims to match scene images (e.g., ground-view images) with geo-tagged aerial images, which is crucial to a wide range of applications such as autonomous driving and street-view navigation. Existing methods neither address the perspective difference well nor capture the scene information effectively. In this work, we propose a Geometric and Scene Information Fusion (GSIF) model for more accurate cross-view geo-localization. GSIF first learns the geometric information of scene images and aerial images via log-polar transformation and spatial-attention aggregation to alleviate the perspective difference. Then, it mines the scene information of scene images via Sky View Factor (SVF) extraction. Finally, the geometric and scene information are fused for image matching, and a balanced loss function is introduced to boost the matching accuracy. Experimental results on two real datasets show that our model significantly outperforms existing methods.
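As an illustration of the log-polar transformation mentioned in the abstract, the following is a minimal sketch of how a square, north-up aerial image could be resampled so that columns correspond to compass direction and rows to log-spaced distance from the image centre, which is roughly the layout of a ground-level panorama. It is not the authors' implementation: the function name `log_polar_transform`, the output size, the angle convention (clockwise from north), and nearest-neighbour sampling are all assumptions made for this example.

```python
import numpy as np


def log_polar_transform(aerial, out_h, out_w):
    """Resample a square aerial image onto a log-polar grid.

    Hypothetical helper for illustration only. Rows run from the farthest
    radius (row 0) to a radius of 1 pixel (last row); columns sweep a full
    turn clockwise starting from north. Requires out_h >= 2.
    """
    size = aerial.shape[0]
    centre = (size - 1) / 2.0
    max_r = (size - 1) / 2.0

    # Log-spaced radii: max_r at row 0, down to 1 pixel at the last row.
    rows = np.arange(out_h)
    radii = np.exp(np.log(max_r) * (out_h - 1 - rows) / (out_h - 1))

    # Evenly spaced angles covering [0, 2*pi).
    thetas = 2.0 * np.pi * np.arange(out_w) / out_w

    r, t = np.meshgrid(radii, thetas, indexing="ij")

    # North is "up" in the aerial image, so y decreases with cos(theta).
    xs = centre + r * np.sin(t)
    ys = centre - r * np.cos(t)

    # Nearest-neighbour sampling, clipped to the image bounds.
    xi = np.clip(np.rint(xs).astype(int), 0, size - 1)
    yi = np.clip(np.rint(ys).astype(int), 0, size - 1)
    return aerial[yi, xi]


# Synthetic test image: each pixel stores its distance from the centre,
# so every output row (one fixed radius) should be nearly constant.
size = 65
yy, xx = np.mgrid[0:size, 0:size]
aerial = np.hypot(xx - (size - 1) / 2.0, yy - (size - 1) / 2.0)
out = log_polar_transform(aerial, out_h=16, out_w=32)
```

With this layout, a rotation of the aerial image about its centre becomes a horizontal circular shift of the output, which is one reason polar-style resampling is a common way to reduce the geometric gap between aerial and ground views before feature extraction.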