Abstract: Unsupervised visible-infrared person re-identification (US-VI-ReID) aims to learn a cross-modality matching model without identity annotations, an important task for retrieving a specific identity in practical nighttime surveillance. Previous advanced US-VI-ReID works mainly focus on associating positive cross-modality identities in an off-line manner to optimize the feature extractor. Because of intra-modality and inter-modality discrepancies, incorrect off-line cross-modality associations inevitably accumulate errors in each training epoch. These works also ignore direct cross-modality feature interaction during training, i.e., on-line representation learning and updating. Worse still, existing interaction methods are themselves susceptible to inter-modality differences, leading to unreliable heterogeneous neighborhood learning. To address these issues, we propose a dual consistency-constrained learning framework (DCCL) that simultaneously incorporates off-line cross-modality label refinement and on-line feature interaction learning. The basic idea is that cross-modality instance-instance relations and instance-identity relations should be consistent. More specifically, DCCL constructs an instance memory, an identity memory, and a domain memory for each modality. At the beginning of each training epoch, DCCL exploits the off-line consistency between cross-modality instance-instance and instance-identity similarities to refine reliable cross-modality identities. During training, DCCL finds credible homogeneous and heterogeneous neighborhoods within each batch using the on-line consistency between query-instance similarities and query-instance domain probability similarities, and uses them for feature interaction, which enhances robustness against intra-modality and inter-modality variations. Extensive experiments validate that our method significantly outperforms existing works and even surpasses some supervised counterparts.
The source code is available at https://github.com/yangbincv/DCCL .
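
To make the off-line consistency idea concrete, the following is a minimal sketch, not the authors' implementation (see the repository above for that). It assumes cosine similarity, a nearest-neighbor rule, and a hypothetical function name `consistent_cross_modality_labels`. A visible instance keeps a cross-modality identity only when the identity of its most similar infrared instance (instance-instance) agrees with its most similar infrared identity centroid (instance-identity).

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def consistent_cross_modality_labels(vis_feats, ir_feats, ir_labels, ir_centroids):
    """Illustrative consistency check (an assumption, not the DCCL code).

    vis_feats:    (Nv, D) visible instance features
    ir_feats:     (Ni, D) infrared instance features
    ir_labels:    (Ni,)   infrared pseudo identity of each infrared instance
    ir_centroids: (K, D)  one feature per infrared identity
    Returns an (Nv,) array: the agreed infrared identity, or -1 if inconsistent.
    """
    vis = l2_normalize(np.asarray(vis_feats, dtype=float))
    ir = l2_normalize(np.asarray(ir_feats, dtype=float))
    cen = l2_normalize(np.asarray(ir_centroids, dtype=float))
    ir_labels = np.asarray(ir_labels)

    # Instance-instance relation: identity of the most similar infrared instance.
    label_from_instance = ir_labels[(vis @ ir.T).argmax(axis=1)]
    # Instance-identity relation: most similar infrared identity centroid.
    label_from_identity = (vis @ cen.T).argmax(axis=1)

    # Keep an association only when both relations point to the same identity.
    agree = label_from_instance == label_from_identity
    return np.where(agree, label_from_identity, -1)

# Toy data: the last infrared instance is an outlier inside identity 1.
ir_feats = np.array([[1.0, 0.0], [0.9, -0.1], [0.0, 1.0], [0.8, 0.6]])
ir_labels = np.array([0, 0, 1, 1])
ir_centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
vis_feats = np.array([[1.0, 0.1], [0.1, 1.0], [0.9, 0.4]])

print(consistent_cross_modality_labels(vis_feats, ir_feats, ir_labels, ir_centroids))
```

In this toy case the third visible instance is closest to the outlier infrared instance of identity 1 but closest to the centroid of identity 0, so the two relations disagree and the association is rejected with -1.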

Counterfactual Attention Alignment for Visible-Infrared Cross-Modality Person Re-Identification

Learning Concordant Attention Via Target-aware Alignment for Visible-Infrared Person Re-identification

Joint Color-irrelevant Consistency Learning and Identity-aware Modality Adaptation for Visible-infrared Cross Modality Person Re-identification

A Spatial-Channel Multi-Attention Parallel Network for Visible-Infrared Person Re-identification

Co-Attentive Lifting for Infrared-Visible Person Re-Identification

Discovering attention-guided cross-modality correlation for visible–infrared person re-identification

Cascaded Cross-modal Alignment for Visible-Infrared Person Re-Identification

Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding

Visible Infrared Cross-Modality Person Re-Identification Network Based on Adaptive Pedestrian Alignment

Dual adaptive alignment and partitioning network for visible and infrared cross-modality person re-identification

Bridging the Gap: Multi-level Cross-modality Joint Alignment for Visible-infrared Person Re-identification

DMA: Dual Modality-Aware Alignment for Visible-Infrared Person Re-Identification

Feature separation and double causal comparison loss for visible and infrared person re-identification

Translation, Association and Augmentation: Learning Cross-Modality Re-Identification From Single-Modality Annotation

Cross-modality disentanglement and shared feedback learning for infrared-visible person re-identification

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification

An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality

Multi-Stage Auxiliary Learning for Visible-Infrared Person Re-identification

Dual Consistency-Constrained Learning for Unsupervised Visible-Infrared Person Re-Identification

Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID

AMC-Net: Attentive Modality-Consistent Network for Visible-Infrared Person Re-Identification