Abstract: Visible-infrared person re-identification (VIReID) has attracted increasing attention due to the demand for 24-hour intelligent surveillance systems. One of the major challenges in this task is the modality discrepancy between visible (VIS) and infrared (NIR) images. Most conventional methods design complex networks or generative models to mitigate the cross-modality discrepancy, while ignoring the fact that the modality gap differs from one VIS-NIR image pair to another. In contrast to existing methods, we propose an Adaptive Middle-modality Alignment Learning (AMML) method, which reduces the modality discrepancy through an adaptive middle-modality learning strategy at both the image level and the feature level. The proposed AMML method has several merits. First, we propose an Adaptive Middle-modality Generator (AMG) module to reduce the modality discrepancy between VIS and NIR images at the image level; it projects the VIS and NIR images into a unified middle-modality image (UMMI) space to adaptively generate middle-modality (M-modality) images. Second, we propose a feature-level Adaptive Distribution Alignment (ADA) loss that forces the distributions of the VIS and NIR features to align adaptively with the distribution of the M-modality features. Moreover, we propose a novel Center-based Diverse Distribution Learning (CDDL) loss, which learns diverse cross-modality knowledge from the different modalities while reducing the discrepancy between the VIS and NIR modalities. Extensive experiments on three challenging VIReID datasets show the superiority of the proposed AMML method over other state-of-the-art methods. Most notably, on the SYSU-MM01 dataset our method achieves 77.8% Rank-1 accuracy and 74.8% mAP in the all-search mode, and 86.6% Rank-1 accuracy and 88.3% mAP in the indoor-search mode.
The code is released at: https://github.com/ZYK100/MMN.
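
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Rank-1 accuracy and mAP are commonly computed from a query-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map` and its arguments are hypothetical, and the sketch assumes that a smaller distance means a better match. It also omits the camera-based gallery filtering and repeated trials of the SYSU-MM01 protocol.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    dist[i][j] is the distance between query i and gallery item j
    (assumption: smaller means more similar). Names are illustrative only.
    """
    rank1_hits, average_precisions = [], []
    for row, qid in zip(dist, query_ids):
        # Rank gallery items from most to least similar for this query.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # queries with no correct gallery item are skipped
        # Rank-1: is the top-ranked gallery item the same identity?
        rank1_hits.append(1.0 if matches[0] else 0.0)
        # Average precision: mean of precision at each correct match.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not rank1_hits:
        raise ValueError("no query has a matching gallery identity")
    n = len(rank1_hits)
    return sum(rank1_hits) / n, sum(average_precisions) / n


# Toy example: two queries, three gallery items (identities 1, 2, 2).
rank1, mean_ap = rank1_and_map(
    dist=[[0.1, 0.5, 0.9], [0.2, 0.8, 0.3]],
    query_ids=[1, 2],
    gallery_ids=[1, 2, 2],
)
```

In the toy example the first query retrieves its identity at rank 1, while the second query finds its two matches only at ranks 2 and 3, so Rank-1 is 0.5 and mAP lies between 0.5 and 1.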

AVPL: Augmented Visual Perception Learning for Person Re-identification and Beyond

Joining Features by Global Guidance with Bi-Relevance Trihard Loss for Person Re-Identification

Contribution-Based Multi-Stream Feature Distance Fusion Method with $k$-Distribution Re-Ranking for Person Re-Identification

Person Re-identification Based on Transform Algorithm

Exploring Part-Informed Visual-Language Learning for Person Re-Identification

Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identification

Adaptive Middle Modality Alignment Learning for Visible-Infrared Person Re-identification

Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement

Deep Learning for Person Re-identification: A Survey and Outlook

Diffusion Augmentation and Pose Generation Based Pre-Training Method for Robust Visible-Infrared Person Re-Identification

VLUReID: Exploiting Vision-Language Knowledge for Unsupervised Person Re-Identification

Stronger Heterogeneous Feature Learning for Visible-Infrared Person Re-Identification

MvHAAN: multi-view hierarchical attention adversarial network for person re-identification

Boosting Person Re-Identification with Viewpoint Contrastive Learning and Adversarial Training

Video-based Person Re-identification with Long Short-Term Representation Learning

Perceive Where to Focus: Learning Visibility-aware Part-level Features for Partial Person Re-identification

Progressive Discriminative Feature Learning for Visible-Infrared Person Re-Identification

Modality Blur and Batch Alignment Learning for Twin Noisy Labels-based Visible–infrared Person Re-identification

A comprehensive survey of visible infrared person re-identification from an application perspective

BV-Person: A Large-scale Dataset for Bird-view Person Re-identification