Unsupervised video summarization framework using keyframe extraction and video skimming

Shruti Jadon, Mahmood Jasim
DOI: https://doi.org/10.1109/iccca49541.2020.9250764
2020-10-30
Abstract: Video is a robust source of information, and the consumption of online and offline videos has reached an unprecedented level in the last few years. However, extracting information from a video presents more challenges than extracting information from a picture. To extract the context of a video, a viewer has to go through the whole video. Apart from context understanding, it is almost impossible to create a universal summarized video for everyone, as everyone has their own bias about which frames are keyframes. For example, in a soccer game, a coach might prefer frames that contain information on player placement and techniques, whereas a person with less knowledge about soccer will focus more on frames that show goals and the scoreboard. Therefore, if we were to tackle the problem of video summarization through supervised learning, it would require extensive personalized data labeling. In this paper, we attempt to solve video summarization through unsupervised learning by employing traditional vision-based algorithmic methodologies for accurate feature extraction from video frames. We also propose deep learning-based feature extraction followed by multiple clustering methods to find an effective way of summarizing a video by extracting interesting keyframes. We compare the performance of these approaches on the SumMe dataset and show that deep learning-based feature extraction performs better on dynamic-viewpoint videos.
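The general idea of the abstract (per-frame feature extraction, then clustering, then one keyframe per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the intensity histogram stands in for the unspecified "traditional vision-based" features, k-means stands in for the unspecified clustering methods, and the names `color_histogram`, `kmeans`, and `select_keyframes` are hypothetical. Frames are assumed to be flat lists of 0-255 grayscale values.

```python
import math
import random


def color_histogram(frame, bins=4):
    """Normalized intensity histogram of a frame (flat list of 0-255 values).

    Assumption: a simple stand-in for the paper's feature extractors.
    """
    hist = [0.0] * bins
    for value in frame:
        hist[min(value * bins // 256, bins - 1)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def assign(features, centroids):
    """Label each feature vector with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(f, centroids[c]))
        for f in features
    ]


def kmeans(features, k, iterations=20, seed=0):
    """Plain Lloyd's k-means (assumed here; the abstract names no algorithm)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iterations):
        labels = assign(features, centroids)
        # Move each centroid to the mean of the frames assigned to it.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(features, centroids)


def select_keyframes(frames, k):
    """Return sorted indices of the frame closest to each cluster centroid."""
    features = [color_histogram(f) for f in frames]
    centroids, labels = kmeans(features, k)
    keyframes = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:  # a cluster may end up empty; skip it
            keyframes.append(
                min(members, key=lambda i: math.dist(features[i], centroids[c]))
            )
    return sorted(keyframes)


# Toy "video": three dark frames followed by three bright frames.
frames = [[10] * 16 for _ in range(3)] + [[240] * 16 for _ in range(3)]
keyframes = select_keyframes(frames, k=2)
```

On this toy input the sketch picks one frame from the dark segment and one from the bright segment. Swapping `color_histogram` for features from a pretrained network would correspond, loosely, to the deep learning-based variant the abstract mentions.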