Hierarchical Latent Concept Discovery for Video Event Detection
Chao Li, Zi Huang, Yang, Jiewei Cao, Xiaoshuai Sun, Heng Tao Shen
DOI: https://doi.org/10.1109/tip.2017.2670782
IF: 10.6
2017-01-01
IEEE Transactions on Image Processing
Abstract: Semantic information is important for video event detection. How to automatically discover, model, and utilize semantic information to facilitate video event detection remains a challenging problem. In this paper, we propose a novel hierarchical video event detection model that deliberately unifies the processes of underlying semantics discovery and event modeling from video data. Specifically, unlike most approaches, which rely on manually pre-defined concepts, we devise an effective model that automatically uncovers video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts (i.e., temporal sequence relationships of static-visual concepts) at the segment level. The unified model not only enables a discriminative and descriptive representation for videos, but also alleviates the problem of error propagation from video representation to event modeling that affects previous methods. A max-margin framework is employed to learn the model. Extensive experiments on four challenging video event datasets (MED11, CCV, UQE50, and FCVID) demonstrate the effectiveness of the proposed method.
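To make the max-margin idea concrete, the following is a minimal illustrative sketch, not the model proposed in the paper. It assumes a simplified setting in which each frame is a feature vector, each latent static-visual concept is a linear template, and a video's event score is the mean over frames of the best-scoring latent concept. The segment-level activity concepts are omitted. All names (`video_score`, `hinge_loss`, `concepts`, `event_weights`) are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def video_score(frames, concepts, event_weights):
    """Score a video and return the latent concept chosen for each frame.

    Assumption (illustrative only): each frame independently selects the
    latent concept h that maximizes event_weights[h] * <concepts[h], frame>,
    and the video score is the mean of these per-frame maxima.
    """
    total = 0.0
    assignment = []
    for frame in frames:
        weighted = [w * dot(c, frame) for c, w in zip(concepts, event_weights)]
        best = max(range(len(weighted)), key=weighted.__getitem__)
        assignment.append(best)
        total += weighted[best]
    return total / len(frames), assignment


def hinge_loss(label, score):
    """Max-margin (hinge) loss for a label in {+1, -1}."""
    return max(0.0, 1.0 - label * score)


# Toy example: two frames, two latent concepts (all values are made up).
frames = [[1.0, 0.0], [0.0, 1.0]]
concepts = [[1.0, 0.0], [0.0, 1.0]]
event_weights = [1.0, 0.5]

score, assignment = video_score(frames, concepts, event_weights)
loss_pos = hinge_loss(+1, score)
loss_neg = hinge_loss(-1, score)
```

In this sketch the latent assignment is inferred jointly with the event score, which loosely mirrors the stated motivation for unifying semantics discovery with event modeling rather than fixing the video representation beforehand.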