Abstract: This study addresses moving object detection (MOD) in infrared and visible images. Existing approaches primarily perform single-task detection on single-spectral image data, such as thermal infrared or visible images, and often ignore both the differences between spectral images and the interconnectedness of related tasks such as image fusion and segmentation. To tackle these problems, we present a novel multi-spectral image fusion network with quality and semantic awareness for MOD, particularly for scenarios where ground truth labels for both infrared and visible images are unavailable. Our approach fuses multi-spectral images and incorporates additional subtasks into the image fusion to obtain content and quality perception of infrared and visible images. In addition, we design a novel residual global perception module (RGPM) and a multi-spectral fusion loss, which capture more hidden features and contextual information across various scales. This enhanced capability leads to more precise detection and tracking of moving objects, particularly in challenging situations involving occlusions and dynamic backgrounds. Compared with single-spectral moving object detection optimization, the hurdles of utilizing deep learning for multi-spectral image fusion, e.g., the absence of ground truth labels and harmful noise, are significantly mitigated. Extensive quantitative and qualitative comparative experiments demonstrate the effectiveness, robustness, and superior performance of the proposed method compared to contemporary methods. In summary, the proposed fusion representation learning achieves gains of 44.2%, 31.2%, 36.3%, 41.6%, 5.3%, 31.2%, and 76.2% on the EI, SF, DF, AG, MI, SD, and Nabf metrics, respectively, over the best competitors.
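For readers unfamiliar with the fusion metrics named in the abstract, three of them can be illustrated briefly: SD (standard deviation), SF (spatial frequency), and AG (average gradient). The following is a minimal sketch based on definitions commonly used in the image fusion literature, not the evaluation code of this work. The function names are hypothetical, and the normalization (averaging over the available finite differences) is an assumption that may differ from the one used in the reported experiments.

```python
import numpy as np


def standard_deviation(img):
    """SD: spread of pixel intensities around the image mean."""
    return float(np.std(np.asarray(img, dtype=np.float64)))


def spatial_frequency(img):
    """SF: sqrt(RF^2 + CF^2) from horizontal and vertical first differences.

    Assumption: the squared differences are averaged over the number of
    available differences, not over the full image size.
    """
    img = np.asarray(img, dtype=np.float64)
    rf2 = np.mean(np.diff(img, axis=1) ** 2)  # row frequency (horizontal changes)
    cf2 = np.mean(np.diff(img, axis=0) ** 2)  # column frequency (vertical changes)
    return float(np.sqrt(rf2 + cf2))


def average_gradient(img):
    """AG: mean of sqrt((gx^2 + gy^2) / 2) using forward differences."""
    img = np.asarray(img, dtype=np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]  # horizontal forward difference
    gy = img[1:, :-1] - img[:-1, :-1]  # vertical forward difference
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))


# Toy input (hypothetical): a 4x4 horizontal intensity ramp.
ramp = np.tile(np.arange(4, dtype=np.float64), (4, 1))
sd, sf, ag = standard_deviation(ramp), spatial_frequency(ramp), average_gradient(ramp)
```

Under these definitions, higher SD, SF, and AG values on a fused image are usually read as higher contrast and richer texture detail, which is the sense in which the percentage gains above are reported.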

Informative Data Selection With Uncertainty for Multimodal Object Detection

MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation

Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion

Multi-spectral Image Fusion for Moving Object Detection

Robust-FusionNet: Deep Multimodal Sensor Fusion for 3-D Object Detection Under Severe Weather Conditions

Modeling Both Intra- and Inter-Modality Uncertainty for Multimodal Fake News Detection

Augmenting 3-D Object Detection Through Data Uncertainty-Driven Auxiliary Framework

Uncertainty-Debiased Multimodal Fusion: Learning Deterministic Joint Representation for Multimodal Sentiment Analysis

Multi-Modal Fusion Based on Depth Adaptive Mechanism for 3D Object Detection

Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection

Multimodal Object Detection via Probabilistic a priori Information Integration

Weakly Aligned Feature Fusion for Multimodal Object Detection

Discriminative unimodal feature selection and fusion for RGB-D salient object detection

Long-Tailed Object Detection for Multimodal Remote Sensing Images

Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection

MIMF: Mutual Information-Driven Multimodal Fusion

Multimodal Fusion on Low-quality Data: A Comprehensive Survey

UDNet: Uncertainty-aware Deep Network for Salient Object Detection

Frustum FusionNet: Amodal 3D Object Detection with Multi-Modal Feature Fusion