Cross modality person re-identification via mask-guided dynamic dual-task collaborative learning

Wenbin Shao, Yujie Liu, Wenxin Zhang, Zongmin Li
DOI: https://doi.org/10.1007/s10489-024-05344-x
IF: 5.3
2024-03-09
Applied Intelligence
Abstract: Visible-infrared cross-modality person re-identification (CM-ReID) has received extensive attention in the community due to its applicability to 24-hour scene surveillance. The large modality discrepancy makes the task highly susceptible to background clutter, especially for infrared images. In this paper, we propose a mask-guided dynamic dual-task collaborative learning (MG-DDCL) method to extract background-irrelevant pedestrian representations. A dynamic dual-task collaborative learning strategy is proposed to extract pedestrian representations and generate foreground masks with a unified convolutional neural network. This strategy improves mAP by 0.95% and Rank-1 by 1.9%. So that the guidance mask facilitates the cross-modality person re-identification task, we convert the hard mask produced by semantic segmentation into a friendlier soft mask and generate the foreground response map by regression learning. Compared with the classification manner, this regression approach has significant advantages. Extensive experiments on two datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
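The hard-mask to soft-mask step and the regression objective mentioned in the abstract can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the mask is softened or which regression loss is used, so the box-blur softening, the mean-squared-error loss, and the names `soften_mask` and `mask_regression_loss` are all hypothetical choices made for illustration.

```python
import numpy as np


def soften_mask(hard_mask, kernel_size=3):
    """Turn a binary foreground mask into a soft mask with values in [0, 1].

    Assumption: softening is done with a simple box blur; the paper may use
    a different scheme.
    """
    pad = kernel_size // 2
    # Replicate border pixels so the output keeps the input's shape.
    padded = np.pad(hard_mask.astype(np.float64), pad, mode="edge")
    height, width = hard_mask.shape
    soft = np.zeros((height, width), dtype=np.float64)
    # Sum every shifted copy of the mask inside the kernel window.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            soft += padded[dy:dy + height, dx:dx + width]
    return soft / (kernel_size ** 2)


def mask_regression_loss(pred_map, soft_mask):
    """Mean squared error between a predicted response map and the soft mask.

    Assumption: MSE stands in for whatever regression loss the paper uses.
    """
    return float(np.mean((pred_map - soft_mask) ** 2))


# Toy example: an 8x8 image with a 4x4 foreground square in the middle.
hard = np.zeros((8, 8), dtype=np.uint8)
hard[2:6, 2:6] = 1
soft = soften_mask(hard, kernel_size=3)
loss = mask_regression_loss(np.full((8, 8), 0.5), soft)
```

In this sketch the soft mask keeps full confidence inside the pedestrian region and decays smoothly at its boundary, which gives a regression target that penalizes small boundary errors less harshly than a per-pixel foreground/background classification target would.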
computer science, artificial intelligence