MCFusion: infrared and visible image fusion based multiscale receptive field and cross-modal enhanced attention mechanism

Min Jiang, Zhiyuan Wang, Jun Kong, Danfeng Zhuang
DOI: https://doi.org/10.1117/1.jei.33.1.013039
IF: 0.829
2024-02-13
Journal of Electronic Imaging
Abstract: Infrared and visible image fusion aims to generate a more comprehensive image for improving measurement and analysis accuracy. Existing Swin Transformer based methods effectively extract features from medium-large scale receptive fields. However, these methods ignore the complementary role of features extracted from the small receptive field, resulting in the loss of essential fine information. To this end, we propose MCFusion, a dual-branch framework based on a multiscale receptive field and cross-modal enhanced attention mechanism. MCFusion is composed of an attention-guided coarse branch (AGCB) and a fine branch (FineB). First, AGCB and FineB are designed to extract distinct yet complementary features from medium-large and small receptive fields, respectively. Second, a cross-modal enhanced attention mechanism is designed to enhance shared features across different modalities. Third, a loss function fully considering contrast, texture, and illumination intensity is proposed to generate fused images. The test results demonstrate that MCFusion is superior to other state-of-the-art methods and can notably enhance the detection and recognition of targets.
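The abstract does not give the form of the loss, so the following is only an illustrative sketch of how intensity and texture terms are often combined in infrared-visible fusion losses. It is not the MCFusion loss itself. The names `gradient_magnitude`, `fusion_loss`, `w_int`, and `w_tex` are hypothetical, the element-wise maximum targets are an assumption, and the contrast term mentioned in the abstract is omitted because its definition is not stated.

```python
import numpy as np


def gradient_magnitude(img):
    """Approximate the gradient magnitude with forward differences.

    The last column/row keeps a zero difference so the output has the
    same shape as the input.
    """
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy[:-1, :] = img[1:, :] - img[:-1, :]
    return np.abs(gx) + np.abs(gy)


def fusion_loss(fused, ir, vis, w_int=1.0, w_tex=1.0):
    """Hypothetical two-term fusion loss (assumption, not from the paper)."""
    # Intensity term: pull the fused image toward the brighter source pixel.
    l_int = np.mean(np.abs(fused - np.maximum(ir, vis)))
    # Texture term: pull fused gradients toward the stronger source gradient.
    target_grad = np.maximum(gradient_magnitude(ir), gradient_magnitude(vis))
    l_tex = np.mean(np.abs(gradient_magnitude(fused) - target_grad))
    return w_int * l_int + w_tex * l_tex
```

With a uniformly dark infrared image and a uniformly bright visible image, a fused image equal to the visible one incurs zero loss under this sketch, while one equal to the infrared image is penalized by the intensity term only.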
engineering, electrical & electronic; optics; imaging science & photographic technology