Intra- and Inter-Modal Graph Attention Network and Contrastive Learning for SAR and Optical Image Registration

Xin Hu, Yan Wu, Xingyu Liu, Zhikang Li, Zhifei Yang, Ming Li
DOI: https://doi.org/10.1109/tgrs.2023.3328368
IF: 8.2
2023-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: The registration of synthetic aperture radar (SAR) and optical images is challenging due to their significant radiometric and geometric differences. Recently, popular registration algorithms based on convolutional neural networks (CNNs) have been limited to extracting local features, resulting in low registration accuracy. In this article, we propose a novel intra- and inter-modal graph attention network and contrastive learning (I2M-GAN&CL) for SAR and optical image registration to solve this problem. First, graphs are constructed from the positional encoding (PE), local features, and $k$-nearest neighbor (KNN) edges of keypoints from the SAR and optical images. Second, based on the local features extracted by CNNs, an intra- and inter-modal graph attention network (I2MGAN) is designed. The I2MGAN mines context information and extracts global features shared between SAR and optical images, mitigating the influence of geometric and radiometric differences on registration results. The graph cross-attention (GCA) layer in I2MGAN extracts global features shared between the two images via message passing between nodes across graphs. Subsequently, the graph self-attention (GSA) layer in I2MGAN aggregates context information by conveying messages between nodes within one graph. Finally, a novel intra- and inter-modal contrastive learning (I2MCL) strategy is developed. This strategy conducts contrastive learning of local and global features within and across modalities to explore feature similarity and increase the number of detected matching point pairs. Experimental results on the publicly available OS dataset demonstrate that the proposed algorithm outperforms existing state-of-the-art algorithms in both the number of matching point pairs and registration accuracy.
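The abstract names two building blocks without giving details: KNN edges between keypoints, and attention-based message passing between nodes across graphs. The following is a minimal sketch of both ideas, not the authors' implementation. The function names `knn_edges` and `cross_attention`, the Euclidean distance metric, the value of `k`, and the single-head dot-product attention without learned projections are all assumptions made for illustration.

```python
import numpy as np


def knn_edges(keypoints: np.ndarray, k: int) -> np.ndarray:
    """Return an (N, k) array: row i holds the indices of the k keypoints
    nearest to keypoint i (Euclidean distance, self excluded).
    Hypothetical helper; the paper's exact graph construction may differ."""
    diff = keypoints[:, None, :] - keypoints[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def cross_attention(query_feats: np.ndarray, source_feats: np.ndarray) -> np.ndarray:
    """Single-head dot-product attention (assumed form): every query node
    aggregates a softmax-weighted message from all nodes of the other graph."""
    d = query_feats.shape[1]
    scores = query_feats @ source_feats.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ source_feats


if __name__ == "__main__":
    # Toy keypoint coordinates standing in for detected SAR keypoints.
    sar_keypoints = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
    print(knn_edges(sar_keypoints, k=2))

    # Random features standing in for CNN descriptors of each modality.
    rng = np.random.default_rng(0)
    sar_feats = rng.normal(size=(4, 8))
    opt_feats = rng.normal(size=(6, 8))
    print(cross_attention(sar_feats, opt_feats).shape)
```

Calling `cross_attention(feats, feats)` on a single graph's features gives the self-attention analogue in this simplified setting; restricting the attention to the neighbors returned by `knn_edges` would be one way to make the message passing sparse, though the abstract does not say whether the paper does so.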