Person re‐identification via deep compound eye network and pose repair module
Hongjian Gu, Wenxuan Zou, Keyang Cheng, Bin Wu, Humaira Abdul Ghafoor, Yongzhao Zhan
DOI: https://doi.org/10.1049/cvi2.12282
IF: 1.484
2024-04-06
IET Computer Vision
Abstract: The following contributions are made: (1) A deep compound eye network based on multi‐camera logical topology is designed to achieve high accuracy on the global person re‐identification task. The network uses graph convolution to process time‐delay‐related pedestrian video sequences, and pedestrian features are then matched through Siamese networks to realise global person re‐identification. (2) A pose repair network based on spatio‐temporal information aggregation is designed to address pedestrian occlusion by using auxiliary information from adjacent cameras. The target pedestrian features under the second‐order logical topology cameras serve as auxiliary information for pose repair, so that whole‐to‐whole matching of pedestrians is achieved. (3) A joint optimisation mechanism for the compound eye network and the pose repair network is introduced to address the problem that the multi‐camera logical topology network and the person re‐identification network cannot converge synchronously during training. Multi‐camera logical topology inference provides auxiliary information for pose repair and a retrieval order for pedestrian matching, and pedestrian matching results are in turn used as feedback to refine the logical topology inference. Person re‐identification aims to search for specific target pedestrians across non‐overlapping cameras. However, in real complex scenes, pedestrians are easily occluded, which makes the target pedestrian search time‐consuming and challenging.
To address pedestrians' susceptibility to occlusion, a person re‐identification method based on a deep compound eye network (CEN) and a pose repair module is proposed. It comprises: (1) a deep CEN based on multi‐camera logical topology, which adopts graph convolution and a Gated Recurrent Unit to capture the temporal and spatial information of pedestrian walking and then carries out global pedestrian matching through a Siamese network; (2) an integrated spatial‐temporal information aggregation network designed to facilitate pose repair, in which the target pedestrian features under the multi‐level logical topology cameras are used as auxiliary information to repair the occluded target pedestrian image, so as to reduce pedestrian mismatches caused by pose changes; (3) a joint optimisation mechanism for the CEN and the pose repair network, in which multi‐camera logical topology inference provides auxiliary information and a retrieval order for the pose repair network. The authors conducted experiments on multiple datasets, including Occluded‐DukeMTMC, CUHK‐SYSU, PRW, SLP, and UJS‐reID. The results indicate that the authors' method achieves strong performance across these datasets. Specifically, on the CUHK‐SYSU dataset, the authors' model achieved a top‐1 accuracy of 89.1% and a mean Average Precision (mAP) of 83.1% in the recognition of occluded individuals.
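The top‐1 accuracy and mean Average Precision reported above are standard retrieval metrics. As a minimal sketch, assuming each query produces a ranked gallery list with one boolean match flag per position, they could be computed as follows. The function names `average_precision` and `evaluate` are illustrative and do not come from the paper, and benchmark‐specific protocol details (for example, which gallery items are filtered out per query) are omitted.

```python
from typing import Dict, Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_matches[i] is True when the gallery item at rank i + 1
    has the same identity as the query (assumed input format).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at the rank of each correct match.
            precision_sum += hits / rank
    # A query with no correct match in the gallery contributes 0.
    return precision_sum / hits if hits else 0.0


def evaluate(rankings: Sequence[Sequence[bool]]) -> Dict[str, float]:
    """Top-1 accuracy and mAP over a non-empty set of ranked lists."""
    n = len(rankings)
    # Top-1: fraction of queries whose first-ranked item is correct.
    top1 = sum(1 for r in rankings if r and r[0]) / n
    # mAP: mean of the per-query average precision values.
    mean_ap = sum(average_precision(r) for r in rankings) / n
    return {"top1": top1, "mAP": mean_ap}


# Two hypothetical queries, each ranked against three gallery items.
example = [[True, False, True], [False, True, False]]
scores = evaluate(example)
```

In this toy example the first query has matches at ranks 1 and 3 and the second at rank 2, so top‐1 is 0.5 and mAP is the mean of 5/6 and 1/2.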
computer science, artificial intelligence; engineering, electrical & electronic