Dual Pseudo-Labels Interactive Self-Training for Semi-Supervised Visible-Infrared Person Re-Identification

Jiangming Shi, Yachao Zhang, Xiangbo Yin, Yuan Xie, Zhizhong Zhang, Jianping Fan, Zhongchao Shi, Yanyun Qu
DOI: https://doi.org/10.1109/iccv51070.2023.01030
2023-01-01
Abstract: Visible-infrared person re-identification (VI-ReID) aims to match a specific person from a gallery of images captured by non-overlapping visible and infrared cameras. Most works focus on fully supervised VI-ReID, which requires substantial cross-modality annotation that is more expensive than single-modality annotation. To reduce the high cost of annotation, we explore two practical semi-supervised settings: uni-semi-supervised (annotating only visible images) and bi-semi-supervised (annotating part of the data in both modalities). These two settings face two challenges that stem from the large cross-modality discrepancy and the lack of correspondence supervision between visible and infrared images: it is difficult to generate reliable pseudo-labels, and it is difficult to learn modality-invariant features from noisy pseudo-labels. In this paper, we propose dual pseudo-label interactive self-training (DPIS) for these two semi-supervised VI-ReID settings. Our DPIS integrates two pseudo-labels generated by distinct models into a hybrid pseudo-label for unlabeled data. However, the hybrid pseudo-label still inevitably contains noise. To eliminate the negative effect of noisy pseudo-labels, we introduce three modules: noise label penalty (NLP), noise correspondence calibration (NCC), and unreliable anchor learning (UAL). Specifically, NLP penalizes noisy labels, NCC calibrates noisy correspondences, and UAL mines hard-to-discriminate features. Extensive experimental results on SYSU-MM01 and RegDB demonstrate that our DPIS achieves impressive performance under these two semi-supervised settings.
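The abstract does not say how the two pseudo-labels are merged into a hybrid pseudo-label. Purely as an illustration of the idea, the sketch below assumes that each model emits a hard class label and that the hybrid label is a weighted average of the two one-hot vectors. The function names, the `weight_a` parameter, and the averaging rule are all hypothetical and are not taken from the paper.

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def hybrid_pseudo_label(label_a, label_b, num_classes, weight_a=0.5):
    """Blend two hard pseudo-labels into one soft label.

    The convex combination used here is an assumed merge rule,
    not the rule defined by DPIS.
    """
    a = one_hot(label_a, num_classes)
    b = one_hot(label_b, num_classes)
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


# Hypothetical pseudo-labels from two distinct models for three unlabeled images.
labels_model_a = [0, 2, 1]
labels_model_b = [0, 1, 1]

hybrid = [hybrid_pseudo_label(a, b, num_classes=3)
          for a, b in zip(labels_model_a, labels_model_b)]

# Where the two models disagree, the hybrid label spreads its mass over both
# candidate identities instead of committing to one of them.
for h in hybrid:
    print(h, "agree" if max(h) == 1.0 else "disagree")
```

Under this assumed rule, a sample on which the two models disagree is easy to spot because no class receives full weight. That is one simple way a later step could flag a pseudo-label as potentially noisy.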