Spatial-temporal Fusion Network with Residual Learning and Attention Mechanism: A Benchmark for Video-Based Group Re-ID

Qiling Xu,Hua Yang,Lin Chen
DOI: https://doi.org/10.1007/978-3-030-31654-9_42
2019-01-01
Abstract:Video-based group re-identification (Re-ID) remains to be a meaningful task under rare study. Group Re-ID contains the information of the relationship between pedestrians, while the video sequences provide more frames to identify the person. In this paper, we propose a spatial-temporal fusion network for the group Re-ID. The network composes of the residual learning played between the CNN and the RNN in a unified network, and the attention mechanism which makes the system focus on the discriminative features. We also propose a new group Re-ID dataset DukeGroupVid to evaluate the performance of our spatial-temporal fusion network. Comprehensive experimental results on the proposed dataset and other video-based datasets, PRID-2011, i-LIDS-VID and MARS, demonstrate the effectiveness of our model.
What problem does this paper attempt to address?