Abstract: The fusion of infrared and visible images aims to synthesize a single image that combines the advantageous characteristics of both sources. However, existing image fusion algorithms often struggle to extract relevant feature information from the source images and to balance the significance of the different modalities. This leads to suboptimal fusion results that inadequately serve advanced downstream visual tasks. To address this challenge, a novel image fusion algorithm, the Multiple Information Supervised Progressive Fusion Network (MISP-Fuse), is proposed. Specifically, MISP-Fuse employs an innovative and robust multi-stage encoder-decoder network called the Full Scale Feature Residual network (FSFR) to extract spatial context and detailed feature information from the infrared and visible images. Within this encoder-decoder network, source image features are progressively extracted at different stages, recombined at multiple scales, and distilled to retain the crucial information. Finally, the fused image is generated by a spatial localization module named the Spatial Localization Network (SLNet). MISP-Fuse also incorporates a multi-information supervision mechanism that links the different modality information in the infrared and visible source images, ensuring that the fused image both aligns with human visual perception and effectively serves advanced downstream visual tasks. Comparative experiments were conducted on diverse image fusion benchmark datasets. Compared with other algorithms, MISP-Fuse demonstrated significant improvements in comprehensive image fusion metrics, including Average Gradient (AG), Sum of the Correlations of Differences (SCD), and Correlation Coefficient (CC).
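The three metrics named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the definitions commonly used in the image fusion literature (AG as the mean root-mean-square of forward differences, CC as the average Pearson correlation between the fused image and each source, SCD as the sum of correlations between each source and the difference of the fused image and the other source). The abstract does not state the exact variants used for MISP-Fuse, and the function names below are hypothetical.

```python
import numpy as np


def _pearson(a, b):
    """Pearson correlation of two same-shaped arrays, flattened."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    # Return 0 for constant inputs, where the correlation is undefined.
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def average_gradient(img):
    """AG: mean of sqrt((gx^2 + gy^2) / 2) over forward differences."""
    img = np.asarray(img, dtype=np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]   # horizontal forward difference
    gy = img[1:, :-1] - img[:-1, :-1]   # vertical forward difference
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))


def correlation_coefficient(ir, vis, fused):
    """CC: average correlation of the fused image with each source."""
    return 0.5 * (_pearson(ir, fused) + _pearson(vis, fused))


def scd(ir, vis, fused):
    """SCD: r(fused - vis, ir) + r(fused - ir, vis)."""
    ir = np.asarray(ir, dtype=np.float64)
    vis = np.asarray(vis, dtype=np.float64)
    fused = np.asarray(fused, dtype=np.float64)
    return _pearson(fused - vis, ir) + _pearson(fused - ir, vis)
```

All three functions expect single-channel images of identical shape. Higher values are generally read as better: AG reflects the amount of detail and texture in the fused image, while CC and SCD reflect how much source information it retains.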

MADMFuse: A Multi-Attribute Diffusion Model to Fuse Infrared and Visible Images

MIFFuse: A Multi-Level Feature Fusion Network for Infrared and Visible Images

MAFusion: Multiscale Attention Network for Infrared and Visible Image Fusion

Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models

FusionDiff: Multi-focus image fusion using denoising diffusion probabilistic models

DATFuse: Infrared and Visible Image Fusion via Dual Attention Transformer

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

Diff-IF: Multi-modality image fusion via diffusion model with fusion knowledge prior

MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion

Infrared and Visible Image Fusion Based on Filtering Enhancement

Integrating Parallel Attention Mechanisms and Multi-Scale Features for Infrared and Visible Image Fusion

DCFusion: Difference correlation-driven fusion mechanism of infrared and visible images

FSADFuse: A Novel Fusion Approach to Infrared and Visible Images

DCAFuse: Dual-Branch Diffusion-CNN Complementary Feature Aggregation Network for Multi-Modality Image Fusion

MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion

SMFD: an end-to-end infrared and visible image fusion model based on shared-individual multi-scale feature decomposition

Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism

MISP-Fuse: A progressive fusion network guided by Multi-Information supervision

MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training

MEEAFusion: Multi-Scale Edge Enhancement and Joint Attention Mechanism Based Infrared and Visible Image Fusion