DIMGNet: A Transformer-based Network for Pedestrian Reidentification with Multi-granularity Information Mutual Gain
Runmin Wang, Zhenlin Zhu, Yanbin Zhu, Hua Chen, Yongzhong Liao, Ziyu Zhu, Yajun Ding, Changxin Gao, Nong Sang
DOI: https://doi.org/10.1109/tmm.2024.3352896
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Pedestrian reidentification (ReID) is a challenging task that involves identifying and retrieving specific pedestrians across different cameras and scenes. This problem has significant implications for security surveillance and has thus received substantial attention in recent years. However, traditional convolutional neural networks (CNNs) have limited receptive fields and cannot capture global information. Moreover, transformer networks, which excel at capturing long-range features, are prone to accuracy degradation due to the loss of fine details. To address these limitations, we propose a transformer-based pedestrian ReID network with double-branch information mutual gain (DIMGNet), which leverages hierarchical parallel levels to support mutual gain among multi-granularity features. Our model also incorporates an auxiliary camera information (ACI) module to improve its feature representation ability. We further embed a cross-attention mechanism into the architecture to enhance the mutual gain between multi-granularity features and improve feature discrimination. Finally, we introduce a shuffling technique to increase the robustness of the extracted features. We evaluate the proposed method on several benchmark datasets, including Market-1501 [1], MSMT17 [2], DukeMTMC-reID [3], and Occluded-Duke [4], achieving mAP values of 90.7%, 68.4%, 83.7%, and 60.6%, respectively. These results outperform most state-of-the-art methods, demonstrating the effectiveness of the proposed approach. The code will be publicly released at https://github.com/ZhenlinZhu/DIMGNet.
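For readers unfamiliar with the mAP figures reported above, the following is a minimal sketch of how mean average precision is typically computed for a retrieval task. It is not taken from the DIMGNet code; the function names and the boolean-list input format are assumptions made for illustration. Standard ReID evaluation protocols also filter out certain gallery images (for example, those from the same camera as the query), which this sketch omits.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches is a hypothetical input format: a list of booleans,
    ordered by descending similarity, where True means the gallery image
    at that rank shares the query's identity.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at each rank where a correct match occurs.
            precision_sum += hits / rank
    # A query with no correct match in the gallery contributes 0 here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_matches):
    """Mean of the per-query average precision values."""
    aps = [average_precision(matches) for matches in all_ranked_matches]
    return sum(aps) / len(aps) if aps else 0.0
```

For example, a query whose correct matches appear at ranks 1 and 3 has an average precision of (1/1 + 2/3) / 2, so ranking all correct matches first is what pushes mAP toward 100%.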
computer science, information systems, telecommunications, software engineering