Local Feature for Visible-Thermal PReID Based on Transformer

Quanyi Pu, Changan Yuan, Hongjie Wu, Xingming Zhao
DOI: https://doi.org/10.1007/978-3-031-13870-6_29
2022-01-01
Abstract: Person re-identification across infrared and RGB images is a challenging cross-modality pedestrian recognition task. Traditional person re-identification aims to retrieve images of a given person from an image database, usually one containing a single modality. Real applications, however, often involve data from multiple modalities, which single-modality methods cannot handle. Cross-modality person re-identification therefore has to extract features from both RGB and infrared images. In our work, we take advantage of both global and local features. First, we use a dual-path ViT structure to extract features from RGB images and infrared images, respectively. Second, we partition the features along the spatial direction and feed the parts into a shared ViT layer to learn local features. The loss function consists of identity loss, triplet loss, and center loss. The model captures features shared between the modalities and improves cross-modality similarity. Finally, we conduct experiments on two datasets, SYSU-MM01 and RegDB, and compare our results with other recent methods.
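The spatial partitioning and the three-term loss described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation. The abstract does not give the partition direction, the number of parts, the triplet margin, or the loss weights, so the horizontal stripes, `margin=0.3`, and `center_weight=0.0005` below are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math


def split_into_stripes(tokens, grid_h, grid_w, num_stripes):
    """Group a row-major list of patch tokens into horizontal stripes.

    Assumption: the spatial cut is horizontal and grid_h is divisible
    by num_stripes. The abstract does not state either detail.
    """
    if len(tokens) != grid_h * grid_w:
        raise ValueError("token count must equal grid_h * grid_w")
    if grid_h % num_stripes != 0:
        raise ValueError("grid_h must be divisible by num_stripes")
    per_stripe = (grid_h // num_stripes) * grid_w
    # Rows are contiguous in row-major order, so a stripe is a plain slice.
    return [tokens[i * per_stripe:(i + 1) * per_stripe]
            for i in range(num_stripes)]


def _distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identity_loss(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    return max(0.0, _distance(anchor, positive)
               - _distance(anchor, negative) + margin)


def center_loss(feature, center):
    """Half the squared distance from a feature to its identity's center."""
    return 0.5 * _distance(feature, center) ** 2


def total_loss(logits, label, anchor, positive, negative, center,
               margin=0.3, center_weight=0.0005):
    """Weighted sum of the three terms; the weights are assumed values."""
    return (identity_loss(logits, label)
            + triplet_loss(anchor, positive, negative, margin)
            + center_weight * center_loss(anchor, center))
```

In a cross-modality setting the anchor and positive would typically be features of the same identity, possibly from different modalities (for example an RGB anchor and an infrared positive), so that the triplet term pulls the two modalities together. The sketch itself does not depend on that choice.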