Creating Memorable Video Summaries That Satisfy the User's Intention for Taking the Videos.

Mengjuan Fei, Wei Jiang, Weijie Mao
DOI: https://doi.org/10.1016/j.neucom.2017.10.030
IF: 6
2017-01-01
Neurocomputing
Abstract: Video summarization facilitates rapid browsing and efficient video indexing in many applications. A good user video summary should be interesting and satisfy the user's intention for taking the video. While many existing methods generate trailers from conventionally constructed videos based on low-level features, this paper focuses on creating interesting video summaries from user videos using three semantic features, namely memorability, snap points, and motion. In the proposed framework, we first apply a novel memorability-based video segmentation method. We use a large image dataset with annotated memorability to fine-tune a Hybrid-AlexNet and predict the memorability scores of all video images using the fine-tuned deep network. Next, we estimate the visual importance scores of the segments based on their memorability scores, snap point scores, and motion scores. We then select an optimal subset of the segments to create a memorable and interesting summary that contains image sequences intentionally taken by the video capturer. Finally, we evaluate our method using the SumMe dataset, which contains 15-18 human-labeled video summaries. The experimental results demonstrate that our method generates high-quality video summaries that are comparable to human-created summaries and those created by state-of-the-art methods. © 2017 Elsevier B.V. All rights reserved.
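The last two steps of the pipeline (scoring segments, then choosing a subset) can be illustrated with a minimal sketch. The abstract does not say how the three scores are combined or how the optimal subset is found, so everything here is an assumption: a weighted sum for importance and a 0/1 knapsack under a summary-length budget for selection. The names `segment_importance` and `select_segments`, the weights, and the budget are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def segment_importance(
    memorability: Sequence[float],
    snap_point: Sequence[float],
    motion: Sequence[float],
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> List[float]:
    """Combine per-segment scores with a weighted sum (assumed, not from the paper)."""
    w_mem, w_snap, w_mot = weights
    return [
        w_mem * m + w_snap * s + w_mot * v
        for m, s, v in zip(memorability, snap_point, motion)
    ]


def select_segments(
    scores: Sequence[float], lengths: Sequence[int], budget: int
) -> List[int]:
    """Pick segment indices maximizing total score with total length <= budget.

    Standard 0/1 knapsack by dynamic programming; lengths and budget are
    in frames. This is an assumed stand-in for the paper's selection step.
    """
    n = len(scores)
    # best[i][b] = best total score using the first i segments within b frames
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if length <= b:
                candidate = best[i - 1][b - length] + score
                if candidate > best[i][b]:
                    best[i][b] = candidate
    # Walk back through the table to recover which segments were chosen.
    chosen = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return chosen
```

For example, `select_segments([0.9, 0.2, 0.8, 0.4], [30, 20, 40, 10], budget=50)` keeps the first and last segments, which together fit the budget with the highest combined score.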