Self-Distillation Feature Learning Network for Optical and SAR Image Registration

Dou Quan, Huiyuan Wei, Shuang Wang, Ruiqi Lei, Baorui Duan, Yi Li, Biao Hou, Licheng Jiao
DOI: https://doi.org/10.1109/TGRS.2022.3173476
IF: 8.2
2022-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Optical and synthetic aperture radar (SAR) image registration is important for multimodal remote sensing image information fusion. Recently, deep matching networks have outperformed traditional image matching methods. However, because optical and SAR images differ significantly, existing deep learning methods still leave room for improvement. This article proposes a self-distillation feature learning network (SDNet) for optical and SAR image registration, improving performance through both network structure and network optimization. First, we explore the impact of different weight-sharing strategies on optical and SAR image matching. Then, we design a partially unshared feature learning network for multimodal image feature learning. It has fewer parameters than the fully unshared network and more flexibility than the fully shared network. In addition, the limited binary supervised information (matching or nonmatching) is insufficient to train deep matching networks for optical-SAR image registration. Thus, we propose a self-distillation feature learning method that exploits additional similarity information, such as the similarity ordering among a series of nonmatching patch pairs, to enhance network optimization. This richer similarity information significantly enhances network training and improves matching accuracy. Finally, existing deep learning methods force the matching features of optical and SAR image patches to be similar, which leads to a loss of discriminative information and degrades matching performance. Thus, we add reconstruction learning as an auxiliary task that optimizes the feature learning network to retain more discriminative information. Extensive experiments demonstrate the effectiveness of the proposed method on multimodal image registration.
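The abstract does not give the self-distillation loss itself, so the following is only a minimal sketch of one common way to supervise a network with similarity ordering rather than binary labels: soft similarity scores are turned into a distribution with a softmax, and a student distribution is pulled toward a teacher distribution with a KL divergence. The function names, the temperature parameter, and the example scores are all hypothetical, and this is not claimed to be the formulation used in SDNet.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw similarity scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def soft_similarity_loss(teacher_scores, student_scores, temperature=1.0):
    """Penalize a student whose similarity distribution departs from the teacher's.

    Assumption: in a self-distillation setting the "teacher" scores would come
    from the network itself (for example another branch or an earlier state).
    """
    teacher = softmax(teacher_scores, temperature)
    student = softmax(student_scores, temperature)
    return kl_divergence(teacher, student)

# Hypothetical similarity scores between one optical patch and four
# nonmatching SAR patches (illustrative values only).
teacher_scores = [0.9, 0.6, 0.3, 0.1]
agreeing_student = [0.8, 0.5, 0.2, 0.0]  # teacher scores shifted by a constant
reversed_student = [0.1, 0.3, 0.6, 0.9]  # opposite similarity ordering

loss_agree = soft_similarity_loss(teacher_scores, agreeing_student)
loss_reversed = soft_similarity_loss(teacher_scores, reversed_student)
print(loss_agree, loss_reversed)
```

A student that preserves the teacher's relative similarities incurs a loss near zero, while one that reverses the ordering is penalized, even though every pair carries the same binary label (nonmatching). This is the kind of extra signal the abstract describes as unavailable from binary supervision alone.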