Abstract: Video summarization facilitates rapid browsing and efficient video indexing in many video browsing website applications, such as sports video highlights and dynamic video covers. In these applications, it is most important to generate video summaries that capture the interesting content users prefer. While many existing methods generate video summaries based on low-level features, this paper first proposes to mine large-scale Flickr images and find "interest" and "non-interest" images from Flickr for the same query to learn what is of interest to users. Unlike existing pairwise ranking-based methods for video summarization, we then propose an improved triplet deep ranking model, which converges more easily, to learn the relationship between "interest" and "non-interest" Flickr images and to identify which visual content of the original video users actually prefer. In the training process, triplets (interest image p+, interest image p'+, non-interest image p-) are selected as input to train a model with three parallel deep convolutional networks. In the video summarization process, an efficient entropy-based video segmentation method is proposed for dividing the original video into segments, and the visual interest scores of the segments are estimated using the trained ranking network for summarization (SumNet). Then, an optimal subset of the segments is selected to create a summary capturing interesting visual content. We evaluate our method against several state-of-the-art methods. Experimental results show that it improves on the best baseline method by 9.6% in terms of mean Average Precision (mAP).
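As a rough illustration of how a triplet of two "interest" images and one "non-interest" image can drive a ranking objective, the sketch below computes a standard triplet hinge loss on toy feature vectors. This is an assumption for illustration only: the abstract does not specify the improved loss, and the function names, the margin value, and the 2-D embeddings are all hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss (assumed here; not the paper's improved variant).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` in squared distance.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, margin + gap)


# Hypothetical embeddings: two "interest" images and one "non-interest" image.
interest_a = [0.0, 0.0]
interest_b = [0.0, 1.0]
non_interest_far = [3.0, 0.0]   # well separated from the interest images
non_interest_near = [1.0, 0.0]  # as close to the anchor as the other interest image

loss_satisfied = triplet_hinge_loss(interest_a, interest_b, non_interest_far)
loss_violated = triplet_hinge_loss(interest_a, interest_b, non_interest_near)
```

In this toy setup the well-separated triplet contributes no loss, while the triplet whose non-interest image sits too close to the anchor is penalized. In a trained network the vectors would be the outputs of the three parallel convolutional branches rather than hand-written lists.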

Probabilistic Skimlets Fusion for Summarizing Multiple Consumer Landmark Videos

A Novel Compact Yet Rich Key Frame Creation Method for Compressed Video Summarization

Creating Memorable Video Summaries That Satisfy the User's Intention for Taking the Videos

Memorable and Rich Video Summarization

A Human-Machine Collaborative Video Summarization Framework Using Pupillary Response Signals

Creating Personalized Video Summaries Via Semantic Event Detection

Learning User Interest with Improved Triplet Deep Ranking and Web-Image Priors for Topic-Related Video Summarization

An Interactive Personalized Video Summarization Based on Sketches

An Unsupervised Video Summarization Method Based on Multimodal Representation

An Effective Video Summarization Framework Toward Handheld Devices

Perceptual Attributes Optimization For Multivideo Summarization

From Thumbnails to Summaries - A single Deep Neural Network to Rule Them All

Unsupervised video summarization framework using keyframe extraction and video skimming

Multi-View Video Summarization

Large Model based Sequential Keyframe Extraction for Video Summarization

Video Summarization Based on User Log Enhanced Link Analysis

Outlier-attenuating Summarization for User-Generated-video

A General Framework for Edited Video and Raw Video Summarization

Highlight Detection With Pairwise Deep Ranking For First-Person Video Summarization

Conditional Modeling Based Automatic Video Summarization

Video Summarization Based on Mutual Information and Entropy Sliding Window Method