A Three-stage Framework for Video-based Visible-Infrared Person Re-Identification

Wei Hou, Wenxuan Wang, Yiming Yan, Di Wu, Qingyu Xia
DOI: https://doi.org/10.1109/lsp.2024.3394676
2024-01-01
IEEE Signal Processing Letters
Abstract: Video-based visible-infrared person re-identification (VI-ReID) aims to identify the same suspicious person captured by different sensors at different times and in different scenes. Overcoming the modality discrepancy caused by the different imaging principles of visible and infrared videos is a crucial challenge in this field. To address this challenge, we propose a three-stage framework for video-based VI-ReID, named the Decomposition-Mining-Aggregation framework, to fully exploit the identity information contained in data of different modalities. The framework consists of three stages: decomposition, mining, and aggregation. Experimental results demonstrate that combining the three stages effectively enhances the feature extraction network's ability to extract identity information from images with modality differences. We also show experimentally how the video frame sampling and exploration strategy influences the quality of the VI-ReID model. Extensive experimental results on the HITSZ-VCM dataset demonstrate that our framework achieves the best accuracy in the VI-ReID task.