Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos

Jiawei Liu, Zheng-Jun Zha, Xierong Zhu, Na Jiang
DOI: https://doi.org/10.48550/arxiv.2004.04979
2020-01-01
Abstract: Person re-identification aims at identifying a certain pedestrian across non-overlapping camera networks. Video-based re-identification approaches have gained significant attention recently, expanding image-based approaches by learning features from multiple frames. In this work, we propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos. It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency from such regions, towards learning discriminative pedestrian representation. Specifically, multiple co-saliency learning modules within CSTNet are designed to utilize the correlated information across video frames to extract the salient features from the task-relevant regions and suppress background interference. Moreover, multiple spatial-temporal interaction modules within CSTNet are proposed, which exploit the spatial and temporal long-range context interdependencies on such features and spatial-temporal information correlation, to enhance feature representation. Extensive experiments on two benchmarks have demonstrated the effectiveness of the proposed method.