Abstract: Person re-identification (Re-ID) is an important problem in video surveillance for matching pedestrian images across non-overlapping camera views. Currently, most works focus on RGB-based Re-ID. However, RGB images are not well suited to dark environments; consequently, infrared (IR) imaging becomes necessary for poorly lit indoor scenes and for 24-hour outdoor surveillance systems. In such scenarios, matching needs to be performed between RGB images and IR images, which exhibit different visual characteristics; this cross-modality matching problem is more challenging than RGB-based Re-ID due to the lack of visible colour information in IR images. To address this challenge, we study the RGB-IR cross-modality Re-ID (RGB-IR Re-ID) problem. Existing cross-modality matching models operate under the assumption of identical data distributions between training and testing sets. Rather than applying them to handle the discrepancy between RGB and IR modalities for Re-ID, we cast learning shared knowledge for cross-modality matching as the problem of cross-modality similarity preservation. We exploit same-modality similarity as the constraint to guide the learning of cross-modality similarity along with the alleviation of modality-specific information, and finally propose a Focal Modality-Aware Similarity-Preserving Loss. To further assist the feature extractor in extracting shared knowledge, we design a modality-gated node as a universal representation of both modality-specific and shared structures, and use it to construct a structure-learnable feature extractor called the Modality-Gated Extractor. For validation, we construct a new multi-modality Re-ID dataset, called SYSU-MM01, to enable wider study of this problem. Extensive experiments on the SYSU-MM01 dataset show the effectiveness of our method. Dataset download link: https://github.com/wuancong/SYSU-MM01
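The idea of using same-modality similarity to constrain cross-modality similarity can be illustrated with a small sketch. This is not the Focal Modality-Aware Similarity-Preserving Loss itself, whose exact form the abstract does not give. It is a hypothetical penalty built on several assumptions: features are plain Python lists, `rgb_feats[i]` and `ir_feats[i]` describe the same sample, similarity is cosine similarity, and the `gamma` exponent is an illustrative focal-style weight that emphasises pairs with a large similarity gap. All function names are made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def similarity_matrix(xs, ys):
    """Pairwise cosine similarities: entry [i][j] compares xs[i] with ys[j]."""
    return [[cosine(x, y) for y in ys] for x in xs]


def similarity_preserving_penalty(rgb_feats, ir_feats, gamma=0.0):
    """Hypothetical penalty: cross-modality similarities should match same-modality ones.

    Assumes rgb_feats[i] and ir_feats[i] belong to the same sample.
    gamma > 0 down-weights pairs whose similarity gap is already small
    (an assumed focal-style weighting, not taken from the paper).
    """
    s_rgb = similarity_matrix(rgb_feats, rgb_feats)    # RGB vs RGB
    s_ir = similarity_matrix(ir_feats, ir_feats)       # IR vs IR
    s_cross = similarity_matrix(rgb_feats, ir_feats)   # RGB vs IR
    n = len(rgb_feats)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Each same-modality similarity acts as a target for the cross-modality one.
            for target in (s_rgb[i][j], s_ir[i][j]):
                gap = abs(s_cross[i][j] - target)
                total += (gap ** gamma) * gap ** 2
    return total / (2 * n * n)


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0]]
    aligned_ir = [[1.0, 0.0], [0.0, 1.0]]   # same geometry as RGB
    swapped_ir = [[0.0, 1.0], [1.0, 0.0]]   # identities mismatched across modalities
    print(similarity_preserving_penalty(rgb, aligned_ir))
    print(similarity_preserving_penalty(rgb, swapped_ir))
```

In this toy setting the penalty vanishes when the two modalities share the same similarity structure and grows when cross-modality similarities disagree with the same-modality ones. A real training loss would operate on batches of learned features and be combined with an identity-level objective.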

Improving RGB-Infrared Object Detection by Reducing Cross-Modality Redundancy

Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion

RGB-IR Person Re-identification by Cross-Modality Similarity Preservation

Lightweight Cross-Modal Information Mutual Reinforcement Network for RGB-T Salient Object Detection

Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification

Towards RGB-NIR Cross-modality Image Registration and Beyond

A Similarity Inference Metric for RGB-Infrared Cross-Modality Person Re-identification

Improving RGB-infrared object detection with cascade alignment-guided transformer

Interactive Context-Aware Network for RGB-T Salient Object Detection

Lightweight Spatial Sliced-Concatenate-Multireceptive-Field Enhance and Joint Channel Attention Mechanism for Infrared Object Detection

Cross-Modality Double Bidirectional Interaction and Fusion Network for RGB-T Salient Object Detection

RGB-T salient object detection via CNN feature and result saliency map fusion

SIA: RGB-T Salient Object Detection Network with Salient-Illumination Awareness

An Interactively Reinforced Paradigm for Joint Infrared-Visible Image Fusion and Saliency Object Detection

Multi-modality information refinement fusion network for RGB-D salient object detection

CSFuser: A Cascade Siamese Fusion Architecture for RGB-Infrared Object Detection

Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection

Cross-Modality Attentive Feature Fusion for Object Detection in Multispectral Remote Sensing Imagery

RGB-T Salient Object Detection Via Fusing Multi-level CNN Features

Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection

Mitigating Modality Discrepancies for RGB-T Semantic Segmentation