Abstract: Infrared and visible image fusion can generate a fused image with clear texture and prominent targets under extreme conditions. This capability is important for all-day, all-weather detection and other tasks. However, most existing fusion methods extract features from infrared and visible images with convolutional neural networks (CNNs). These methods often fail to make full use of the salient objects and texture features in the raw images, leading to problems such as insufficient texture detail and low contrast in the fused images. To this end, we propose an unsupervised end-to-end Fusion Decomposition Network (FDNet) for infrared and visible image fusion. First, we construct a fusion network that extracts gradient and intensity information from the raw images using multi-scale layers, depthwise separable convolution, and an improved convolutional block attention module (I-CBAM). Second, because FDNet extracts features based on the gradient and intensity information of the image, we design gradient and intensity losses accordingly. The intensity loss adopts an improved Frobenius norm to adjust the weights between the fused image and the two raw images so that more effective information is selected. The gradient loss introduces an adaptive weight block that determines the optimization objective based on the richness of texture information at the pixel scale, ultimately guiding the fused image to contain richer texture information. Finally, we design a decomposition network with single- and dual-channel convolutional layers, which keeps the decomposed images as consistent as possible with the input raw images, forcing the fused image to contain richer detail information. Compared with other representative image fusion methods, the proposed method not only produces good subjective visual quality but also achieves advanced fusion performance in objective evaluation.
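As an illustration of the two loss terms described in the abstract, the following is a minimal NumPy sketch, not the paper's actual formulation. It assumes single-channel images of equal size, a plain weighted squared Frobenius norm for the intensity term (the abstract's "improved" variant is not specified here), and a binary per-pixel weight that picks whichever source image has the larger gradient magnitude. The function names and the fixed weights `w_ir` and `w_vis` are hypothetical.

```python
import numpy as np


def gradient_magnitude(img):
    """Finite-difference gradient magnitude (a stand-in for the paper's operator)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)


def intensity_loss(fused, ir, vis, w_ir=0.5, w_vis=0.5):
    """Weighted squared Frobenius distance from the fused image to each source.

    w_ir and w_vis are assumed fixed weights; the paper adjusts them in a way
    the abstract does not detail.
    """
    h, w = fused.shape
    d_ir = np.linalg.norm(fused - ir, ord="fro") ** 2
    d_vis = np.linalg.norm(fused - vis, ord="fro") ** 2
    return (w_ir * d_ir + w_vis * d_vis) / (h * w)


def gradient_loss(fused, ir, vis):
    """L1 distance between the fused gradient and a per-pixel target gradient.

    The target at each pixel is the gradient of the source with the richer
    texture (larger gradient magnitude), an assumed form of adaptive weighting.
    """
    g_f = gradient_magnitude(fused)
    g_ir = gradient_magnitude(ir)
    g_vis = gradient_magnitude(vis)
    target = np.where(g_ir > g_vis, g_ir, g_vis)
    return np.mean(np.abs(g_f - target))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.random((8, 8))
    vis = rng.random((8, 8))
    fused = 0.5 * (ir + vis)  # naive average, used only as a test input
    print(intensity_loss(fused, ir, vis), gradient_loss(fused, ir, vis))
```

Both terms vanish when the fused image equals both sources, and the gradient term penalizes a fused image that is smoother than the sharper of the two inputs at any pixel.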

SDTFusion: A split-head dense transformer based network for infrared and visible image fusion

THFuse: An Infrared and Visible Image Fusion Network using Transformer and Hybrid Feature Extractor

SFPFusion: An Improved Vision Transformer Combining Super Feature Attention and Wavelet-Guided Pooling for Infrared and Visible Images Fusion

HitFusion: Infrared and Visible Image Fusion for High-Level Vision Tasks Using Transformer

HDCTfusion: Hybrid Dual-Branch Network Based on CNN and Transformer for Infrared and Visible Image Fusion

TDDFusion: A Target-Driven Dual Branch Network for Infrared and Visible Image Fusion

DATFuse: Infrared and Visible Image Fusion via Dual Attention Transformer

DTFusion: Infrared and Visible Image Fusion Based on Dense Residual PConv-ConvNeXt and Texture-Contrast Compensation

TCCFusion: An Infrared and Visible Image Fusion Method based on Transformer and Cross Correlation

SimpliFusion: a simplified infrared and visible image fusion network

CGTF: Convolution-Guided Transformer for Infrared and Visible Image Fusion

Multi-scale attention-based lightweight network with dilated convolutions for infrared and visible image fusion

Rethinking Cross-Attention for Infrared and Visible Image Fusion

FDNet: An end-to-end fusion decomposition network for infrared and visible images

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

YDTR: Infrared and Visible Image Fusion via Y-shape Dynamic Transformer

MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion

ADF‐Net: Attention‐guided deep feature decomposition network for infrared and visible image fusion

HATF: Multi-Modal Feature Learning for Infrared and Visible Image Fusion via Hybrid Attention Transformer

DCFusion: A Dual-Frequency Cross-Enhanced Fusion Network for Infrared and Visible Image Fusion

When Image Decomposition Meets Deep Learning: A Novel Infrared and Visible Image Fusion Method