Abstract: Infrared and visible image fusion aims to generate a fused image that contains salient targets and rich texture details and also facilitates high-level vision tasks. However, because of hardware limitations of digital cameras and other devices, existing datasets contain many low-resolution images, which often lose detail and structural information. In addition, existing fusion algorithms focus heavily on the visual quality of the fused image and neglect the requirements of high-level vision tasks. To address these challenges, we combine a super-resolution network, a fusion network, and a segmentation network, and propose a super-resolution-based semantic-aware fusion network. First, we design a super-resolution network based on a multi-branch hybrid attention module (MHAM) that enhances the quality and details of the source images, enabling the fusion network to integrate their features more accurately. Then, we design a comprehensive information extraction module (STDC) in the fusion network to strengthen its ability to extract finer-grained complementary information from the source images. Finally, the fusion network and the segmentation network are jointly trained so that a semantic loss feeds semantic information back to the fusion network, which improves the performance of the fused images on high-level vision tasks. Extensive experiments show that our method is more effective than other state-of-the-art image fusion methods: the fused images not only have excellent visual quality but also improve performance on high-level vision tasks.
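
The joint training step described in the abstract can be illustrated with a minimal sketch of a combined objective. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: the content term (mean absolute error against the element-wise maximum of the two source images), the use of per-pixel cross-entropy as the semantic loss, the weight `weight`, and all function names are hypothetical and are not taken from the paper.

```python
import math

def pixel_cross_entropy(logits, label):
    """Cross-entropy for one pixel; `logits` is a list of class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def semantic_loss(seg_logits, labels):
    """Mean per-pixel cross-entropy over a flat list of pixels.

    Assumption: the semantic loss is a standard segmentation loss
    computed on the fused image by the segmentation network.
    """
    losses = [pixel_cross_entropy(l, y) for l, y in zip(seg_logits, labels)]
    return sum(losses) / len(losses)

def content_loss(fused, infrared, visible):
    """Placeholder content term (an assumption, not the paper's loss):
    mean absolute error between each fused pixel and max(ir, vis)."""
    diffs = [abs(f - max(i, v)) for f, i, v in zip(fused, infrared, visible)]
    return sum(diffs) / len(diffs)

def joint_loss(fused, infrared, visible, seg_logits, labels, weight=0.5):
    """Content term plus a weighted semantic term.

    `weight` is a hypothetical hyperparameter balancing visual quality
    against segmentation performance.
    """
    return (content_loss(fused, infrared, visible)
            + weight * semantic_loss(seg_logits, labels))

# Toy example: two pixels, two classes.
infrared = [0.9, 0.1]
visible = [0.2, 0.6]
fused = [0.9, 0.6]                      # equals the element-wise maximum
seg_logits = [[0.0, 0.0], [0.0, 0.0]]   # uninformative segmentation output
labels = [0, 1]
total = joint_loss(fused, infrared, visible, seg_logits, labels)
```

In a real training loop the gradient of the semantic term would flow through the segmentation network back into the fusion network, which is how semantic information could influence the fused image; this plain-Python sketch only shows how the two terms might be combined.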

Semantic-Aware Fusion Network Based on Super-Resolution
