Abstract: Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) offers a more flexible and cost-effective alternative to supervised methods, and it has attracted increasing attention. Existing methods simply cluster modality-specific samples and employ strong association techniques to achieve instance-to-cluster or cluster-to-cluster cross-modality associations. However, they ignore cross-camera differences, so a single identity is often split across too many clusters, which undermines the accuracy and reliability of cross-modal associations. To address these issues, we propose a novel Dynamic Modality-Camera Invariant Clustering (DMIC) framework for USL-VI-ReID. Specifically, DMIC integrates Modality-Camera Invariant Expansion (MIE), Dynamic Neighborhood Clustering (DNC), and Hybrid Modality Contrastive Learning (HMCL) into a unified framework that eliminates both cross-modality and cross-camera discrepancies in clustering. MIE fuses inter-modal and inter-camera distance coding to bridge the gaps between modalities and cameras at the clustering level. DNC employs two dynamic search strategies to refine the network's optimization objective, transitioning from improving discriminability to enhancing cross-modal and cross-camera generalizability. Moreover, HMCL is designed to optimize instance-level and cluster-level distributions. Memories for intra-modality and inter-modality training are updated using randomly selected samples, facilitating real-time exploration of modality-invariant representations. Extensive experiments demonstrate that DMIC addresses the limitations of current clustering approaches and achieves competitive performance, significantly reducing the performance gap with supervised methods.
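The abstract mentions cluster-level contrastive learning with memories that are updated from randomly selected samples. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes an InfoNCE-style loss over cluster memory entries and a memory refresh that overwrites each entry with one randomly chosen member feature. All function names, the temperature value of 0.05, and the toy data are hypothetical choices made for illustration.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cluster_contrastive_loss(features, labels, memory, temperature=0.05):
    """Mean InfoNCE loss of each feature against all cluster memory entries.

    Assumes `memory` rows are already L2-normalized and `labels` index into it.
    The temperature is an assumed value, not one taken from the abstract.
    """
    total = 0.0
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        logits = [sum(a * b for a, b in zip(f, m)) / temperature for m in memory]
        top = max(logits)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[y]  # negative log-softmax of the true cluster
    return total / len(features)


def update_memory_random(memory, features, labels, rng):
    """Overwrite each cluster's entry with one randomly chosen member feature."""
    new_memory = [list(m) for m in memory]
    for c in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == c]
        new_memory[c] = l2_normalize(features[rng.choice(members)])
    return new_memory


# Toy usage: 6 samples, 3 pseudo-label clusters, 4-dimensional features.
rng = random.Random(0)
features = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
labels = [0, 0, 1, 1, 2, 2]
memory = [l2_normalize([rng.gauss(0, 1) for _ in range(4)]) for _ in range(3)]

loss_before = cluster_contrastive_loss(features, labels, memory)
memory = update_memory_random(memory, features, labels, rng)
```

In a cross-modality setting, one such memory could plausibly be kept per modality plus a shared one, but how DMIC organizes its memories is not specified in the abstract, so that detail is left out here.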

Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification

Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement

Inter-Intra Modality Knowledge Learning and Clustering Noise Alleviation for Unsupervised Visible-Infrared Person Re-Identification

Deep Siamese Network with Multi-level Similarity Perception for Person Re-identification

Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID

Progressive Contrastive Learning with Multi-Prototype for Unsupervised Visible-Infrared Person Re-identification

Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification

Cross-modality Hierarchical Clustering and Refinement for Unsupervised Visible-Infrared Person Re-Identification

Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification

Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identification

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding

Co-segmentation assisted cross-modality person re-identification

Translation, Association and Augmentation: Learning Cross-Modality Re-Identification From Single-Modality Annotation

Cross-modality disentanglement and shared feedback learning for infrared-visible person re-identification

Cooperative Separation of Modality Shared-Specific Features for Visible-Infrared Person Re-Identification

Mutual Information Guided Optimal Transport for Unsupervised Visible-Infrared Person Re-identification

Text-augmented Multi-Modality contrastive learning for unsupervised visible-infrared person re-identification

Joint Color-irrelevant Consistency Learning and Identity-aware Modality Adaptation for Visible-infrared Cross Modality Person Re-identification

Robust Pseudo-label Learning with Neighbor Relation for Unsupervised Visible-Infrared Person Re-Identification

Progressive Discriminative Feature Learning for Visible-Infrared Person Re-Identification