Inter-Modality Similarity Learning for Unsupervised Multi-Modality Person Re-Identification

Zhiqi Pang, Lingling Zhao, Yang Liu, Gaurav Sharma, Chunyu Wang
DOI: https://doi.org/10.1109/tcsvt.2024.3408831
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: RGB (visible), near-infrared (NI), and thermal infrared (TI) imaging modalities are commonly combined for round-the-clock surveillance. We introduce a novel unsupervised multi-modality person re-identification (MM-ReID) task, which, based on an individual’s image in any one modality, seeks to identify matches in the other two modalities. Compared to prior MM-ReID problem formulations, unsupervised MM-ReID significantly reduces labeling cost and imaging constraints. To address the unsupervised MM-ReID task, we propose a novel inter-modality similarity learning (IMSL) framework consisting of four synergistic, interconnected modules: modality mean clustering (MMC), multi-modality reliability estimation (MMRE), shape-based mutual reinforcement (SMR), and modality-aware invariant learning (MIL). MMC iterates with SMR and MIL in a mutually beneficial manner to provide pseudo-labels that are robust to the modality gap. MMRE normalizes sample weights, mitigating the impact of noisy labels in the multi-modality setting. SMR emphasizes shape information to implicitly enhance the model’s robustness to the modality gap and is additionally guided by pseudo-labels provided by MMC to attend to identity-related details. MIL explicitly encourages learning of modality-invariant and identity-related features via contrastive feedback for the MMC module. Extensive experimental results on multi-modality and cross-modality datasets demonstrate that IMSL provides substantial performance gains over existing methods. Code is made available at https://github.com/zqpang/IMSL.
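The retrieval task described in the abstract (query in one modality, matches sought in the other two) can be illustrated with a minimal sketch. This is not the IMSL method or code from the authors' repository: the function name `rank_other_modalities`, the toy two-dimensional features, and the use of cosine similarity for ranking are all assumptions made purely for illustration.

```python
import numpy as np

def rank_other_modalities(query_feat, query_modality, gallery_feats, gallery_modalities):
    """Rank gallery samples from modalities other than the query's (hypothetical helper).

    Ranking uses cosine similarity, an assumed choice for this sketch.
    Returns (gallery indices, similarities), most similar first.
    """
    gallery_modalities = np.asarray(gallery_modalities)
    # Keep only gallery entries captured in a different modality than the query.
    candidates = np.flatnonzero(gallery_modalities != query_modality)
    # L2-normalize so the dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats[candidates]
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims, kind="stable")
    return candidates[order], sims[order]

# Toy features (made up): one row per gallery image, with its modality tag.
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
modalities = ["RGB", "NI", "TI", "TI"]
query = np.array([1.0, 0.05])  # an RGB query

ranked, scores = rank_other_modalities(query, "RGB", gallery, modalities)
print(ranked)
```

In this sketch the RGB gallery entry is excluded from the ranking, so only NI and TI candidates are returned. In the unsupervised setting the abstract describes, the features themselves would have to be learned without identity labels.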