Attribute Vision Transformer for UAV-Human Re-Identification

Hao Ni, Yuke Li, Ping Lai, Pengpeng Zeng, Hangyu Guo, Lianli Gao
DOI: https://doi.org/10.1109/icmew63481.2024.10645408
2024-01-01
Abstract: Person re-identification (ReID) is a challenging task that involves recognizing the same individual across different cameras and from various angles. Unlike other ReID datasets, the UAV-Human dataset includes annotations for pedestrian attributes, providing richer information for model training and evaluation. To exploit this attribute information, we introduce an Attribute Vision Transformer (A-ViT), which integrates attribute tokens with image tokens to improve ReID performance. We also employ two ranking-list optimization strategies, Multi-Query Search (MQS) and Multi-Model Ensemble (MME), to improve the accuracy of the ranking list. MQS jointly retrieves with multiple images of the same probe, using multi-view information to yield more accurate and reliable results. MME, in contrast, combines the expertise of several specialized models to produce a more comprehensive retrieval outcome. We conducted extensive experiments on the UAV-Human dataset to assess the impact of these techniques on model accuracy. Our proposed solution surpasses state-of-the-art methods, achieving mean Average Precision (mAP) and Rank-1 scores of 82.2% and 81.3%, respectively. The code is available at: https://github.com/liyuke65535/MMVRAC-reid.
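The two ranking-list strategies can be illustrated with a minimal sketch. The abstract does not say how MQS fuses the probe images or how MME combines models, so this sketch assumes mean-pooling of L2-normalized query features for MQS and averaging of cosine-distance vectors for MME. All function names here are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def multi_query_feature(query_feats):
    """MQS (assumed form): fuse several images of one probe.

    query_feats: array of shape (num_images, dim).
    Returns one unit-norm feature of shape (dim,), the mean of the
    normalized per-image features.
    """
    return l2_normalize(l2_normalize(query_feats).mean(axis=0))


def cosine_distance(query, gallery):
    """Cosine distance from one query (dim,) to each gallery row (n, dim)."""
    return 1.0 - l2_normalize(gallery) @ l2_normalize(query)


def ensemble_ranking(per_model_distances):
    """MME (assumed form): average distances over models, then rank.

    per_model_distances: list of arrays, each of shape (n,), one per model.
    Returns gallery indices sorted from best to worst match.
    """
    fused = np.mean(np.stack(per_model_distances), axis=0)
    return np.argsort(fused, kind="stable")
```

In use, each model would extract features for all images of a probe and for the gallery. `multi_query_feature` and `cosine_distance` then yield one distance vector per model, and `ensemble_ranking` turns those vectors into the final ranking list.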