MEFusion: Unsupervised Mutual Enhancement for Multimodal Image Fusion

Yushe Cao, Siwen Jiao, Penghao Sun, Baoyun Peng, Dianxi Shi, Yuanchun Shi
DOI: https://doi.org/10.3233/faia240533
2024-01-01
Abstract: Image fusion aims to extract valuable information from each modality to create a fused image. Current state-of-the-art image fusion approaches typically first decompose each modality into distinct yet complementary features, then transfer beneficial information to the target through carefully hand-crafted or learned fusion rules. However, these approaches treat each modality in isolation before fusion, potentially under-utilising the complementary information available across modalities. In this work, we introduce a novel method called MEFusion that pioneers cross-modality mutual enhancement before feature decomposition. By harnessing the individual strengths of each modality, MEFusion improves the overall quality and comprehensiveness of the fusion outcome. To facilitate bidirectional enhancement of each feature across modalities, we design a pluggable co-attention mechanism that integrates seamlessly into a lightweight dual-path transformer. Furthermore, to enrich the details of each modality, we propose an unsupervised cross-modality mutual enhancement loss, which removes the need for paired training data in enhancement tasks. Extensive experiments on several benchmark datasets demonstrate the superiority of MEFusion in terms of both traditional fusion metrics and the perceptual quality of the fused images.
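To make the idea of bidirectional enhancement concrete, the following is a minimal sketch of a generic co-attention step between two sets of modality features. It is not the authors' implementation: the abstract does not specify the layer design, so the function names (`cross_attend`, `co_attention`), the absence of learned projections, and the residual addition are all assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention: each query token attends over context tokens.

    Assumption: no learned query/key/value projections, to keep the sketch short.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        # Weighted sum of context tokens, one feature dimension at a time.
        attended = [sum(w * k[j] for w, k in zip(weights, context)) for j in range(d)]
        out.append(attended)
    return out


def co_attention(feat_a, feat_b):
    """Hypothetical bidirectional enhancement: each modality queries the other.

    The attended features are added back residually (an assumption).
    """
    a_from_b = cross_attend(feat_a, feat_b)
    b_from_a = cross_attend(feat_b, feat_a)
    enh_a = [[x + y for x, y in zip(t, u)] for t, u in zip(feat_a, a_from_b)]
    enh_b = [[x + y for x, y in zip(t, u)] for t, u in zip(feat_b, b_from_a)]
    return enh_a, enh_b


# Toy usage: two tokens from one modality, one token from the other.
feat_a = [[1.0, 0.0], [0.0, 1.0]]
feat_b = [[2.0, 3.0]]
enh_a, enh_b = co_attention(feat_a, feat_b)
```

Each output keeps the token count and feature width of its own modality, so a step of this kind could in principle be placed between existing layers of a two-branch network, which is one reading of "pluggable" in the abstract.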