M2CNet: an Infrared and Visible Image Fusion Method Based on Dual Marginal Contrastive Learning

He Ye, Zhiqiang Zhou, Yuhao Wang, Weiyi Chen, Lingjuan Miao, Jiaqi Li
DOI: https://doi.org/10.1109/ccdc62350.2024.10587987
2024-01-01
Abstract: Infrared and visible image fusion has no ground truth, and the source images differ greatly in appearance and content, so most fusion methods rely on unsupervised learning. However, the losses currently used in unsupervised learning, i.e., pixel losses and structural losses, do not accurately characterize the modal differences between source images. Complex losses that combine multiple terms show poor robustness across datasets and make model convergence difficult. In addition, most methods struggle to achieve a satisfactory information balance between source images. To address these challenges, we propose a fusion method based on dual marginal contrastive learning, namely M2CNet. First, we propose a novel contrastive loss with dual margin penalties to enhance the ability to build cross-modal connections between the fused image and the source images. Notably, our approach uses only this succinct loss to address the fusion task, without the currently used losses. Second, we propose a three-branch network to fuse extensive complementary information from the source images and make an excellent trade-off between them. Qualitative and quantitative experiments demonstrate the superiority of our method over state-of-the-art methods.