MV-MOE: A Visual Mixture-of-Experts Model for Optical-SAR Image Matching

Jingyi Cao, Yanan You, Jun Liu
DOI: https://doi.org/10.1109/igarss53475.2024.10641296
2024-01-01
Abstract: Optical and Synthetic Aperture Radar (SAR) matching produces spatial and semantic correspondences of the input images, playing a pivotal role in the registration process. However, due to the difference in radiation characteristics and geometric properties, even the same target may manifest distinctive morphological and feature expressions in the cross-modal images. Consistent feature extraction remains a challenge for optical-SAR image matching. Therefore, based on the salient image patterns (keypoint, line, and block), a Visual Mixture-of-Experts method for optical-SAR image Matching (MV-MOE) is proposed. It facilitates adaptive graphical representation of multi-modal images across various scenes through the multi-task learning framework. With the aid of the attention mechanism, the task-related basic features are reconstructed into matching-related features, yielding similarity along with spatial offset vectors. Additionally, we employ a multi-level feature extraction backbone based on the visual retentive block, enhancing local feature perception with the preservation capability of the recurrent network structure. Experiments demonstrate the advantages of the proposed method on multi-modal image matching and its contribution to the subsequent registration.
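To make the mixture-of-experts idea in the abstract concrete, the following is a minimal sketch of generic soft gating over three expert feature vectors, loosely named after the keypoint, line, and block patterns. It is not the authors' architecture: the function names (`softmax`, `mix_experts`), the toy feature values, and the use of a plain softmax gate are all assumptions made for illustration, and the abstract does not specify how MV-MOE actually weights or combines its experts.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_features, gate_scores):
    """Blend per-expert feature vectors using softmax gate weights.

    expert_features: list of equal-length feature vectors, one per expert.
    gate_scores: one raw score per expert (hypothetical gate output).
    Returns the fused feature vector and the gate weights.
    """
    weights = softmax(gate_scores)
    dim = len(expert_features[0])
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, expert_features))
        for i in range(dim)
    ]
    return fused, weights


# Hypothetical toy features for three experts (assumed values, not from the paper).
keypoint_feat = [1.0, 0.0]
line_feat = [0.0, 1.0]
block_feat = [1.0, 1.0]

# Equal gate scores give each expert the same weight.
fused, weights = mix_experts(
    [keypoint_feat, line_feat, block_feat],
    [0.0, 0.0, 0.0],
)
```

In a learned model the gate scores would depend on the input scene, so that different image content emphasizes different experts; here they are fixed constants purely to show the weighting step.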