Abstract: Video content is present in an ever-increasing number of fields, both scientific and commercial. Sports, and soccer in particular, is one of the industries that has invested the most in video analytics, owing to the massive popularity of the game and the emergence of new markets. Previous state-of-the-art methods for summarizing soccer match videos rely on handcrafted heuristics that generalize poorly, but these works have nonetheless shown that multiple modalities help detect the best actions of the game. Meanwhile, machine learning models with greater generalization potential have entered the field of general-purpose video summarization, offering several deep learning approaches. However, most of them exploit content characteristics that do not suit whole-match sports videos. Although video content has for many years been the main source for automating knowledge extraction in soccer, data recording all the events that happen on the field has recently become very important in sports analytics, since this event data provides richer contextual information and requires less processing. We propose a method that generates the summary of a soccer match by exploiting both the audio and the event metadata. The results show that our method can detect the actions of the match, identify which of these actions belong in the summary, and then propose multiple candidate summaries that are similar to one another yet varied enough to give the final editor different options. Furthermore, we show the generalization capability of our work: it transfers knowledge between datasets from different broadcasting companies and different competitions, acquired under different conditions, and corresponding to summaries of different lengths.

Multimodal feature extraction and fusion for semantic mining of soccer video: a survey

Creating Personalized Video Summaries Via Semantic Event Detection

Semantic event detection via multimodal data mining

ANALYSIS OF VISION BASED SYSTEMS TO DETECT REAL TIME GOAL EVENTS IN SOCCER VIDEOS

Semantic Event Extraction From Basketball Games Using Multi-Modal Analysis

Video-based Analysis of Soccer Matches

Event Detection In Basketball Video Using Multiple Modalities

A decision tree-based multimodal data mining framework for soccer goal detection

Deep Understanding of Soccer Match Videos

Event Analysis in Soccer Video by Dynamic Programming Based Fusion of Multiple Modalities

A Fusion Scheme of Visual and Auditory Modalities for Event Detection in Sports Video

Automatic Summarization of Soccer Highlights Using Audio-visual Descriptors

A Multi-stage deep architecture for summary generation of soccer videos

Automated soccer event detection and highlight generation for short and long views

On Semantic Annotation for Sports Video Highlights by Mining User Comments from Live Broadcast Social Network

Towards Universal Soccer Video Understanding

Sports Video Mining with Mosaic

An Automatic Multi-Camera-based Event Extraction System for Real Soccer Videos

A mid-level representation framework for semantic sports video analysis

Multi-Mode Semantic Cues Based on Hidden Conditional Random Field in Soccer Video

Video Data Mining: Semantic Indexing and Event Detection from the Association Perspective