Effective Similarity Measurement for Video-based Person Re-identification

Yiheng Liu, Chao Xie, Wengang Zhou, Houqiang Li
DOI: https://doi.org/10.1109/vcip.2018.8698685
2018-01-01
Abstract: Learning a discriminative spatial-temporal feature representation and distance metric is crucial for video-based person re-identification. Most current approaches compute similarity directly from the extracted feature vectors, but a single feature vector is not sufficient to overcome the noise caused by background clutter and by large variations in pose and viewpoint. To this end, we incorporate the learning of spatial-temporal feature representation and similarity measurement into a unified framework for video-based person re-identification. We propose a similarity measurement layer, which measures the implicit similarity of two video sequences in different regions. This strategy makes the network more robust to noise. Meanwhile, to alleviate the imbalance between the numbers of positive and negative samples, we propose a matching sampling loss to help train the similarity measurement layer. We conduct extensive comparative experiments on three challenging datasets: iLIDS-VID, PRID-2011, and MARS. The experimental results demonstrate that the proposed approach achieves favorable or superior performance compared with state-of-the-art methods for video-based person re-identification.
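The two ideas in the abstract, comparing sequences region by region and the imbalance between positive and negative pairs, can be illustrated with a minimal sketch. This is not the paper's similarity measurement layer or its matching sampling loss, neither of which is specified in the abstract. It assumes each sequence is already summarized as one feature vector per region, uses plain cosine similarity averaged over regions as a stand-in for a learned measure, and the names `cosine`, `region_similarity`, and `count_pairs` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def region_similarity(seq_a, seq_b):
    """Compare two sequences region by region.

    Assumption: seq_a and seq_b are lists holding one feature vector per
    region. Returns the mean of the per-region scores and the scores
    themselves. A learned layer would replace the cosine and the mean.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("both sequences need the same number of regions")
    scores = [cosine(ra, rb) for ra, rb in zip(seq_a, seq_b)]
    return sum(scores) / len(scores), scores


def count_pairs(num_identities, seqs_per_identity):
    """Count positive (same identity) and negative sequence pairs."""
    n = num_identities * seqs_per_identity
    total = n * (n - 1) // 2
    positive = num_identities * seqs_per_identity * (seqs_per_identity - 1) // 2
    return positive, total - positive


# Two regions agree, the third is corrupted (for example by background clutter).
seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seq_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
mean_score, per_region = region_similarity(seq_a, seq_b)

# With 100 identities and 2 sequences each, negatives vastly outnumber positives.
positive_pairs, negative_pairs = count_pairs(100, 2)
```

Keeping the per-region scores separate shows that one noisy region lowers only its own score, and the pair count (100 positive versus 19800 negative pairs in this toy setting) shows why some form of balancing is needed during training.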