Key Frame Extraction Based on Frame Difference and Cluster for Person Re-identification
Yiyin Ding, Shaoqi Hou, Xu Yang, Wenyi Du, Chunyu Wang, Guangqiang Yin
DOI: https://doi.org/10.1109/swc50871.2021.00085
2021-01-01
Abstract: The features extracted in person re-identification (Re-ID) contain both spatial and temporal information, but for video data sets, key frames must first be selected as network input. In this process, problems such as data redundancy and data insufficiency often occur, which make the selected key frames unrepresentative and reduce the expressiveness of the network. At present, key frame selection with traditional features relies mainly on methods based on SIFT, histograms, and spatio-temporal color distribution; in terms of selection strategies, the main algorithms are Random, Evenly, and Frame difference. The methods based on SIFT and histograms cannot accurately extract feature points for targets with smooth edges, and they lose a large amount of information. The method based on spatio-temporal color distribution places very high demands on video color quality: if the video resolution of the camera is poor, the color distribution information is easily lost. The key frames selected by the Random and Evenly methods are highly random. The method based on Frame difference can filter out most video frames with highly similar features, which alleviates the problem of data redundancy, but the problem of data imbalance remains. In response to these problems, this paper proposes a new key frame extraction method, frame difference and cluster (FDC), which integrates the idea of K-means clustering. First, the paired distance list of the difference images is calculated, and a pre-selected frame set is generated from it. Then, the pre-selected frame set is clustered without supervision, and finally the target key frame set is obtained. With the FDC method, we extract the key frames that best represent the dynamic information of pedestrians in the video data set. This not only reduces the number of input frames but also improves computational efficiency.
While preserving the accuracy of the data, we use relatively few samples to train better models on different network structures. On the MARS data set, FDC improves on the traditional Evenly and Frame difference extraction algorithms by 1.6% and 2.4%, respectively.
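
The two-stage idea described in the abstract (frame differencing to build a pre-selected set, then K-means clustering to pick the final key frames) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the paired distance list is computed or how frames are chosen from clusters, so the mean absolute pixel difference, the `keep_ratio` parameter, the raw-pixel features, and the "frame nearest to each centroid" rule are all assumptions made for illustration, as are the function names.

```python
import numpy as np


def preselect_by_frame_difference(frames, keep_ratio=0.5):
    """Return sorted indices of the frames that differ most from their predecessor.

    Assumption: the difference score is the mean absolute pixel difference
    between consecutive frames; keep_ratio is a hypothetical parameter.
    """
    frames = np.asarray(frames, dtype=np.float64)
    flat = frames.reshape(len(frames), -1)
    diffs = np.abs(flat[1:] - flat[:-1]).mean(axis=1)
    # The first frame has no predecessor, so it is always kept as a candidate.
    scores = np.concatenate(([np.inf], diffs))
    n_keep = max(1, int(round(keep_ratio * len(frames))))
    keep = np.argsort(-scores, kind="stable")[:n_keep]
    return np.sort(keep)


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute labels so they match the returned centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def extract_key_frames(frames, n_key_frames=4, keep_ratio=0.5):
    """Pre-select by frame difference, cluster, keep the frame nearest each centroid."""
    frames = np.asarray(frames, dtype=np.float64)
    candidates = preselect_by_frame_difference(frames, keep_ratio)
    k = min(n_key_frames, len(candidates))
    feats = frames[candidates].reshape(len(candidates), -1)
    centroids, labels = kmeans(feats, k)
    chosen = []
    for j in range(k):
        members = np.where(labels == j)[0]
        if len(members) == 0:
            continue
        d = np.linalg.norm(feats[members] - centroids[j], axis=1)
        chosen.append(int(candidates[members[d.argmin()]]))
    return sorted(set(chosen))
```

Calling `extract_key_frames(frames, n_key_frames=3)` on an array of shape `(T, H, W)` returns at most three frame indices in temporal order. Clustering raw pixels keeps the sketch dependency-free; a practical pipeline would more likely cluster a compact descriptor of each frame.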