Abstract: With the prevalence of dual-mode cameras in surveillance systems, visible-infrared person re-identification (VI-ReID) has become an emerging topic. Existing VI-ReID studies fall roughly into three categories: direct feature extraction, improved loss functions, and visible-infrared modality generation. Generation methods avoid a shortcoming of the former two, namely that the trained models are generally sensitive to parameter changes. However, these generation methods usually operate in the spatial domain and inevitably damage the original information of the images. To tackle these limitations, we propose a novel frequency-domain simulated multispectral (FSMS) modality and visible-FSMS-infrared collaborative learning. The FSMS modality consists of three-channel images generated by a channel-level reconstruction of visible images, primarily based on the nonsubsampled contourlet transform (NSCT) working together with a lightweight network. The generation exploits the crucial spectral and edge information contained in the frequency domain. We then design a multi-modality network for tri-modality collaborative learning in which the FSMS modality serves as an intermediate, thereby preserving the original spatial structure of the images. Additionally, a dynamic-weight tri-modality heterogeneous retrieval (THR) loss and a modality-shared classification (MSI) loss are devised to mine discriminative modality-invariant features. A cross-modality invariant (CMI) constraint, which further explores triplet-wise relationships, and an intra-modality regularizer, which gives relatively stable convergence, are also introduced. Finally, experimental results show that our algorithm significantly outperforms the latest state-of-the-art methods by 5.7% and 4.4% in CMC-1 accuracy on two mainstream benchmark datasets, respectively. The reasons underlying the observed performance gain are also discussed in depth.
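The abstract reports its gains as CMC-1 (rank-1) accuracy. As a rough illustration of what that metric measures, the sketch below computes the fraction of query images whose nearest gallery image has the same identity. It is a simplified stand-in, not the paper's evaluation code: the function name `cmc_rank1`, the toy feature vectors, and the use of plain Euclidean distance are all assumptions, and protocol details of the real benchmarks (such as camera filtering or repeated gallery sampling) are omitted.

```python
import math

def cmc_rank1(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose closest gallery feature shares their identity."""
    hits = 0
    for q_feat, q_id in zip(query_feats, query_ids):
        # Index of the gallery feature nearest to this query (Euclidean distance).
        best = min(range(len(gallery_feats)),
                   key=lambda j: math.dist(q_feat, gallery_feats[j]))
        hits += gallery_ids[best] == q_id
    return hits / len(query_feats)

# Hypothetical toy data: infrared queries matched against a visible gallery.
infrared_queries = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
query_ids = [1, 2, 3]
visible_gallery = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.3]]
gallery_ids = [1, 2, 1]

rank1 = cmc_rank1(infrared_queries, query_ids, visible_gallery, gallery_ids)
print(f"CMC-1: {rank1:.3f}")
```

In this toy case the first two queries retrieve the correct identity and the third does not, because identity 3 is absent from the gallery, so two of the three queries count as hits.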

A Three-stage Framework for Video-based Visible-Infrared Person Re-Identification

SR-VIReID: Super Resolution Assisted Visible-Infrared Person Re-Identification

Cooperative Separation of Modality Shared-Specific Features for Visible-Infrared Person Re-Identification

Occluded Visible-Infrared Person Re-Identification

An Efficient Framework for Visible-Infrared Cross Modality Person Re-Identification

Cross-Modality Spatial-Temporal Transformer for Video-Based Visible-Infrared Person Re-Identification

A Visible-Infrared Person Re-Identification Method Based on Meta-Graph Isomerization Aggregation Module

Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification

Joint Color-irrelevant Consistency Learning and Identity-aware Modality Adaptation for Visible-infrared Cross Modality Person Re-identification

Video-based Visible-Infrared Person Re-Identification with Auxiliary Samples

A Feature Map is Worth a Video Frame: Rethinking Convolutional Features for Visible-Infrared Person Re-identification

Co-segmentation assisted cross-modality person re-identification

Visible-Infrared Person Re-Identification Based on Frequency-Domain Simulated Multispectral Modality for Dual-Mode Cameras

Video-based Visible-Infrared Person Re-Identification via Style Disturbance Defense and Dual Interaction

RBDF: Reciprocal Bidirectional Framework for Visible Infrared Person Reidentification

Feature separation and double causal comparison loss for visible and infrared person re-identification

MIMR: Modality-Invariance Modeling and Refinement for unsupervised visible-infrared person re-identification

Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification

Visible Embraces Infrared: Cross-Modality Person Re-Identification with Single-Modality Supervision

Stronger Heterogeneous Feature Learning for Visible-Infrared Person Re-Identification

High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-identification