End-to-end soccer video scene and event classification with deep transfer learning

Yuxi Hong, Chen Ling, Zuochang Ye
DOI: https://doi.org/10.1109/ISACV.2018.8369043
2018-01-01
Abstract: Soccer video scene classification and event classification are two essential tasks in soccer video semantic analysis, and they have attracted considerable research interest because of their importance and practicality. However, most proposed methods solve these two tasks separately. To solve both tasks at the same time and improve the efficiency of video processing, we treat them as a single end-to-end classification task. We introduce a new Soccer Video Scene and Event Dataset (SVSED), which contains 600 video clips in six categories drawn from the scenes and events. We then show that frame features of different categories, extracted from a pretrained CNN model, are separable in 3-D space. Finally, we construct a CNN model for the classification task and use deep transfer learning to improve the classification result given the relatively small training dataset. We fine-tuned several state-of-the-art CNN models and achieved accuracy above 89% within several minutes of training.
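The transfer-learning idea in the abstract can be illustrated with a minimal sketch: keep a pretrained feature extractor fixed and train only a new six-way classification head. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which layers were fine-tuned or which framework was used. Here `frozen_features` is a hypothetical toy stand-in for a pretrained CNN backbone, `make_toy_data` generates synthetic vectors rather than SVSED frames, and all function names and hyperparameters are invented for the example.

```python
import math
import random

NUM_CLASSES = 6  # the abstract states that SVSED has six categories


def frozen_features(frame):
    """Hypothetical stand-in for a pretrained CNN backbone with fixed weights.

    A real pipeline would return activations from a pretrained network.
    """
    return [math.tanh(x) for x in frame]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, feats):
    """Class probabilities from the linear head applied to frozen features."""
    logits = [sum(w * f for w, f in zip(row, feats)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def train_head(samples, labels, dim, epochs=30, lr=0.2, seed=0):
    """Train only the new softmax head with SGD on cross-entropy loss.

    The feature extractor is never updated; that is the 'frozen' part.
    """
    rng = random.Random(seed)
    weights = [[0.0] * dim for _ in range(NUM_CLASSES)]
    biases = [0.0] * NUM_CLASSES
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            feats = frozen_features(samples[i])
            probs = predict_proba(weights, biases, feats)
            for c in range(NUM_CLASSES):
                # Gradient of cross-entropy with respect to logit c.
                grad = probs[c] - (1.0 if c == labels[i] else 0.0)
                biases[c] -= lr * grad
                for j in range(dim):
                    weights[c][j] -= lr * grad * feats[j]
    return weights, biases


def make_toy_data(per_class=20, dim=6, seed=0):
    """Synthetic 'frames' (an assumption): class c responds strongly in component c."""
    rng = random.Random(seed)
    samples, labels = [], []
    for c in range(NUM_CLASSES):
        for _ in range(per_class):
            frame = [rng.gauss(0.0, 0.3) for _ in range(dim)]
            frame[c] += 2.0
            samples.append(frame)
            labels.append(c)
    return samples, labels


def accuracy(weights, biases, samples, labels):
    """Fraction of samples whose most probable class matches the label."""
    correct = 0
    for frame, label in zip(samples, labels):
        probs = predict_proba(weights, biases, frozen_features(frame))
        correct += int(probs.index(max(probs)) == label)
    return correct / len(labels)


if __name__ == "__main__":
    samples, labels = make_toy_data()
    weights, biases = train_head(samples, labels, dim=6)
    print("training accuracy on toy data:", accuracy(weights, biases, samples, labels))
```

Training only the head is the lightest form of transfer learning and suits small datasets, because few parameters need to be estimated. Fine-tuning, as the abstract describes, would additionally update some or all backbone weights, typically with a small learning rate.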