Unsupervised Video Summarization Based on Consistent Clip Generation

Xin Ai, Yan Song, Zechao Li
DOI: https://doi.org/10.1109/bigmm.2018.8499188
2018-01-01
Abstract: It has become increasingly convenient for people to shoot, store, and share videos of their daily lives on social networks, which makes it increasingly difficult to find desired content in the massive volume of video data. Automatic video summarization methods are therefore needed. Previous methods focus on category-specific videos and build various complex models, while recent deep learning approaches need a large amount of annotated data to train the network. This paper proposes a new unsupervised video summarization method that selects a group of self-consistent highlight clips. Specifically, we propose a consistent clip generation method, i.e., the cutting-merging-adjusting scheme, which exploits both clip similarity and local similarity. Consistent clips are obtained by iteratively merging similar clips and then adjusting the boundaries of each consistent clip so that clip boundaries no longer disagree with the boundaries of logical events. We then estimate the importance score of each consistent clip from the interestingness scores of its frames and select the most important clips to generate the video summary. Experimental results show that our method generates high-quality summaries that are closer to human perception than those of several existing methods.
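The cutting, merging, and selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measures, so cosine similarity between mean clip features is assumed here, and the names `merge_clips`, `select_top_clips`, `clip_len`, and `threshold` are hypothetical. The boundary-adjusting step is omitted because the abstract gives no detail on how it is performed.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def merge_clips(features, clip_len=2, threshold=0.9):
    """Cut frames into fixed-length clips, then repeatedly merge the most
    similar pair of adjacent clips until no pair reaches the threshold."""
    n = len(features)
    # Cutting: initial clips as half-open (start, end) frame index ranges.
    clips = [(s, min(s + clip_len, n)) for s in range(0, n, clip_len)]
    while len(clips) > 1:
        means = [features[s:e].mean(axis=0) for s, e in clips]
        sims = [cosine(means[i], means[i + 1]) for i in range(len(clips) - 1)]
        best = int(np.argmax(sims))
        if sims[best] < threshold:
            break
        # Merging: replace the two adjacent clips with their union.
        clips[best:best + 2] = [(clips[best][0], clips[best + 1][1])]
    return clips


def select_top_clips(clips, frame_scores, k=1):
    """Score each clip by the mean of its frame scores and return the k
    highest-scoring clips in temporal order."""
    scores = [float(np.mean(frame_scores[s:e])) for s, e in clips]
    order = sorted(range(len(clips)), key=lambda i: scores[i], reverse=True)[:k]
    return [clips[i] for i in sorted(order)]


# Toy input: 8 frames with 2-D features and made-up interestingness scores.
features = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)
frame_scores = [0.1] * 4 + [0.9] * 4

clips = merge_clips(features)
summary = select_top_clips(clips, frame_scores, k=1)
```

In this toy input the first four frames share one feature vector and the last four share another, so merging stops once two clips remain. Averaging frame scores is only one possible way to aggregate frame-level interestingness into a clip-level importance score.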