Abstract: Unsupervised visible-infrared person re-identification (US-VI-ReID) aims to learn a cross-modality matching model under unsupervised conditions, an important task for retrieving a specific identity in practical nighttime surveillance. Previous advanced US-VI-ReID works mainly focus on associating positive cross-modality identities off-line to optimize the feature extractor. Because of intra-modality and inter-modality discrepancies, incorrect off-line cross-modality associations inevitably accumulate errors in each training epoch. These works also ignore direct cross-modality feature interaction during training, i.e., on-line representation learning and updating. Worse still, existing interaction methods are themselves susceptible to inter-modality differences, leading to unreliable heterogeneous neighborhood learning. To address these issues, we propose a dual consistency-constrained learning framework (DCCL) that simultaneously incorporates off-line cross-modality label refinement and on-line feature interaction learning. The basic idea is that cross-modality instance-instance relations and instance-identity relations should be consistent. More specifically, DCCL constructs an instance memory, an identity memory, and a domain memory for each modality. At the beginning of each training epoch, DCCL exploits the off-line consistency between cross-modality instance-instance and instance-identity similarities to refine reliable cross-modality identities. During training, DCCL finds credible homogeneous and heterogeneous neighborhoods within a batch using the on-line consistency between query-instance similarities and query-instance domain probability similarities, and uses them for feature interaction, enhancing robustness against intra-modality and inter-modality variations. Extensive experiments validate that our method significantly outperforms existing works and even surpasses some supervised counterparts.
The source code is available at https://github.com/yangbincv/DCCL .
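
To make the off-line consistency idea concrete, the following is a minimal sketch and not the released DCCL implementation. It assumes one simple reading of the abstract: a visible instance receives an infrared identity label only when its nearest infrared instance (instance-instance) and its nearest infrared identity prototype (instance-identity) point to the same identity. The function name `consistent_cross_modality_labels`, the cosine-similarity measure, the nearest-neighbor rule, and the `-1` marker for rejected instances are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def consistent_cross_modality_labels(vis_feats, ir_feats, ir_labels, ir_prototypes):
    """Hypothetical consistency check between two cross-modality views.

    vis_feats:     (Nv, D) visible instance features (stand-in for an instance memory)
    ir_feats:      (Ni, D) infrared instance features
    ir_labels:     (Ni,)   infrared pseudo-labels in [0, K)
    ir_prototypes: (K, D)  infrared identity features (stand-in for an identity memory)

    Returns (refined, consistent): refined holds the agreed label, or -1 when
    the two views disagree; consistent is the boolean agreement mask.
    """
    vis = l2_normalize(vis_feats)
    ir = l2_normalize(ir_feats)
    protos = l2_normalize(ir_prototypes)

    # Instance-instance view: label of the most similar infrared instance.
    label_from_instance = ir_labels[(vis @ ir.T).argmax(axis=1)]

    # Instance-identity view: index of the most similar infrared identity.
    label_from_identity = (vis @ protos.T).argmax(axis=1)

    # Keep an association only when both views agree (assumed rule).
    consistent = label_from_instance == label_from_identity
    refined = np.where(consistent, label_from_identity, -1)
    return refined, consistent


# Toy 2-D example with two infrared identities.
ir_prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
ir_feats = np.array([[1.0, 0.1], [0.1, 1.0], [0.9, 0.8]])
ir_labels = np.array([0, 1, 1])
vis_feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.75, 0.7]])

refined, consistent = consistent_cross_modality_labels(
    vis_feats, ir_feats, ir_labels, ir_prototypes
)
print(refined.tolist())  # [0, 1, -1]
```

In this toy case the third visible instance is closest to an infrared instance labeled 1 but closest to the prototype of identity 0, so the two views disagree and the association is discarded rather than passed on as a noisy label.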