Multi-layer multi-view topic model for classifying advertising video.

Sujuan Hou, Ling Chen, Dacheng Tao, Shangbo Zhou, Wenjie Liu, Yuanjie Zheng
DOI: https://doi.org/10.1016/j.patcog.2017.03.003
IF: 8
2017-01-01
Pattern Recognition
Abstract: We propose a new video representation from the perspective of multiple-view fusion, using a multi-layer multi-view LDA framework. The proposed video representation scheme effectively addresses the problem of ad video classification. We built a publicly available advertisement video dataset that can be shared for research on video advertising. The recent proliferation of advertising (ad) videos has driven research in multiple applications, ranging from video analysis to video indexing and retrieval. Among them, classifying ad videos is a key task because it allows automatic organization of videos according to categories or genres, which in turn enables ad video indexing and retrieval. However, classifying ad videos is challenging compared to other types of video classification because of their unconstrained content. While many studies focus on embedding ads relevant to videos, to our knowledge, few focus on ad video classification. To classify ad videos, this paper proposes a novel ad video representation that aims to sufficiently capture the latent semantics of video content from multiple views in an unsupervised manner. In particular, we represent ad videos from four views: bag-of-features (BOF), vector of locally aggregated descriptors (VLAD), Fisher vector (FV) and object bank (OB). We then devise a multi-layer multi-view topic model, mlmv_LDA, which models the topics of videos from different views. The proposed method finally yields a topical representation for video that supports category-related tasks. Our empirical classification results on 10,111 real-world ad videos demonstrate that the proposed approach effectively differentiates ad videos.
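To make two of the four views concrete, the following is a minimal sketch of BOF and VLAD encoding using their standard textbook definitions. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the local descriptors, the codebook size, or how mlmv_LDA consumes these vectors, so the toy codebook, the toy descriptors, and the function names `bof_encode` and `vlad_encode` are all hypothetical.

```python
import math

def nearest(codebook, d):
    """Return the index of the codeword closest to descriptor d (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: sum((x - c) ** 2 for x, c in zip(d, codebook[k])))

def l2_normalize(v):
    """Scale v to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def bof_encode(descriptors, codebook):
    """Bag-of-features: normalized histogram of nearest-codeword assignments."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def vlad_encode(descriptors, codebook):
    """VLAD: per-codeword sum of residuals (d - c), flattened and L2-normalized."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        k = nearest(codebook, d)
        for j in range(dim):
            acc[k][j] += d[j] - codebook[k][j]
    flat = [x for row in acc for x in row]
    return l2_normalize(flat)

# Toy data (assumed for illustration): a 2-word codebook and four 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [11.0, 12.0]]

bof = bof_encode(descriptors, codebook)    # one value per codeword
vlad = vlad_encode(descriptors, codebook)  # len(codebook) * dim values
```

BOF keeps only how often each codeword is hit, while VLAD also records where the descriptors fall relative to their codeword, which is one reason the two encodings can act as complementary views of the same video.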
What problem does this paper attempt to address?