Abstract: Standard infrared and visible image fusion algorithms rely on manually designed strategies and low-level image statistics for saliency detection, so they perform poorly when infrared targets lie at the edge of an image or are relatively small. To address this issue, SeGFusion is proposed. It is a semantic saliency guided infrared and visible image fusion method composed of an autoencoder, a fusion layer, and a Semantic Segmentation-based Deep Saliency model (SSDS). SSDS focuses on the structural information of images and generates saliency maps at the feature level, so that infrared targets can be extracted more accurately, which avoids introducing artifacts and noise into the fused images. The saliency maps dynamically generated by SSDS guide the training of the fusion model, so that the saliency map of the fused image closely resembles that of the original infrared image. The saliency maps are also used to partition images into target areas and background areas, which allows distinct loss functions to be designed for the characteristics of each area. As a result, the fused images preserve both salient targets and intricate background details. Experiments on the public datasets TNO, RoadScene, and MSRS show that our algorithm has advantages over state-of-the-art algorithms in both objective metrics and subjective evaluations. Notably, SeGFusion achieves high scores on key indicators such as FMI, VIF, and SD, and produces visually clear fused images in subjective assessments. These results indicate the potential of the proposed algorithm for applications in fields such as infrared instruments and equipment.
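To make the region-wise loss idea concrete, the following is a minimal NumPy sketch of how a saliency map could split an image into target and background areas, each with its own loss term. The abstract does not give the actual loss formulas, so everything here is an assumption for illustration: the function names (`gradient_magnitude`, `region_losses`, `total_loss`), the threshold of 0.5, the weights `alpha` and `beta`, and the specific choice of an intensity term for targets and a gradient term for background are hypothetical and are not taken from the paper.

```python
import numpy as np

def gradient_magnitude(img):
    """Finite-difference gradient magnitude (|dx| + |dy|), same shape as img."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.abs(gx) + np.abs(gy)

def region_losses(fused, ir, vis, saliency, threshold=0.5):
    """Return (target_loss, background_loss) for one grayscale image triple.

    Hypothetical loss forms, NOT taken from the paper:
      target area     -> mean absolute intensity difference to the infrared image
      background area -> mean absolute gradient difference to the visible image
    """
    target = saliency >= threshold      # boolean mask of salient (target) pixels
    background = ~target                # everything else is background
    fused = fused.astype(np.float64)

    intensity_err = np.abs(fused - ir)
    gradient_err = np.abs(gradient_magnitude(fused) - gradient_magnitude(vis))

    # Guard against empty regions so each mean is always defined.
    target_loss = intensity_err[target].mean() if target.any() else 0.0
    background_loss = gradient_err[background].mean() if background.any() else 0.0
    return float(target_loss), float(background_loss)

def total_loss(fused, ir, vis, saliency, alpha=1.0, beta=1.0):
    """Weighted sum of the two region losses (alpha and beta are assumed weights)."""
    t, b = region_losses(fused, ir, vis, saliency)
    return alpha * t + beta * b

if __name__ == "__main__":
    # Toy 8x8 scene: a small bright infrared target on a flat visible background.
    ir = np.zeros((8, 8))
    ir[2:4, 2:4] = 1.0
    vis = np.full((8, 8), 0.2)
    saliency = ir.copy()                # stand-in for a saliency map from SSDS
    print(region_losses(vis, ir, vis, saliency))
    print(total_loss(ir, ir, vis, saliency))
```

In this sketch a fused image that simply copies the visible image is penalized only in the target area, while one that copies the infrared image is penalized only in the background, which is the trade-off a region-wise loss is meant to expose. In a real training setup the same masking would be applied to differentiable tensors rather than NumPy arrays.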

GRDATFusion: A gradient residual dense and attention transformer infrared and visible image fusion network for smart city security systems in cloud and fog computing

A multi‐focus image fusion network deployed in smart city target detection

IETAFusion: An illumination enhancement and target‐aware infrared and visible image fusion network for security system of smart city

DTFusion: Infrared and Visible Image Fusion Based on Dense Residual PConv-ConvNeXt and Texture-Contrast Compensation

CGTF: Convolution-Guided Transformer for Infrared and Visible Image Fusion

FDNet: An end-to-end fusion decomposition network for infrared and visible images

DATFuse: Infrared and Visible Image Fusion via Dual Attention Transformer

SCGRFuse: An infrared and visible image fusion network based on spatial/channel attention mechanism and gradient aggregation residual dense blocks

SDTFusion: A split-head dense transformer based network for infrared and visible image fusion

SeGFusion: A semantic saliency guided infrared and visible image fusion method

TDDFusion: A Target-Driven Dual Branch Network for Infrared and Visible Image Fusion

DCFusion: A Dual-Frequency Cross-Enhanced Fusion Network for Infrared and Visible Image Fusion

GTMFuse: Group-Attention Transformer-Driven Multiscale Dense Feature-Enhanced Network for Infrared and Visible Image Fusion

An Improved Infrared and Visible Image Fusion Using an Adaptive Contrast Enhancement Method and Deep Learning Network with Transfer Learning

A Multi-Stage Visible and Infrared Image Fusion Network Based on Attention Mechanism

HDCTfusion: Hybrid Dual-Branch Network Based on CNN and Transformer for Infrared and Visible Image Fusion

IR-MSDNet: Infrared and Visible Image Fusion Based On Infrared Features and Multiscale Dense Network

GLFuse: A Global and Local Four-Branch Feature Extraction Network for Infrared and Visible Image Fusion

YDTR: Infrared and Visible Image Fusion via Y-shape Dynamic Transformer

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

HSFusion: A high-level vision task-driven infrared and visible image fusion network via semantic and geometric domain transformation