A Statistics-Based Method For Video Semantic Analysis

Wei Wei, Zhen X. Yue, Min Huang
DOI: https://doi.org/10.1109/ICMLC.2007.4370405
2007-01-01
Abstract: Based on statistical theory, this paper proposes a generic framework for video semantic content analysis. Multilayer semantic analysis and multimodal information fusion are unified in a single model. First, a frame-segment key-frame strategy and an attention selection model are used to represent video content concisely. Basic visual semantics are then recognized with pattern classification techniques, and a multilayer structural model is used to extract multi-level visual semantics. Next, an audio semantic analysis scheme is presented that uses spectrum features extracted by the Fourier transform. Finally, a bionic multimodal fusion method with a two-level structure is proposed for video semantic concept analysis. Experimental results demonstrate that the framework can fuse multimodal features, extract semantics at different granularities, and bridge the semantic gap to some extent.
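As a rough illustration of the audio step mentioned in the abstract (spectrum features obtained by a Fourier transform), the sketch below computes the magnitude spectrum of one audio frame and sums its energy into a few frequency bands. It is not the authors' implementation: the function names `magnitude_spectrum` and `band_energies`, the 64-sample frame, and the choice of 4 equal-width bands are all assumptions made for this example, and a naive discrete Fourier transform stands in for whatever transform routine the paper used.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT for bins 0..n/2 of one real-valued audio frame.

    Naive O(n^2) transform, kept simple for illustration (hypothetical helper).
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def band_energies(spectrum, num_bands):
    """Sum squared magnitudes into num_bands roughly equal-width bands.

    The band layout is an assumption of this sketch, not taken from the paper.
    """
    energies = [0.0] * num_bands
    for k, mag in enumerate(spectrum):
        band = min(k * num_bands // len(spectrum), num_bands - 1)
        energies[band] += mag * mag
    return energies


# Synthetic frame: a pure tone that falls exactly on DFT bin 5.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = magnitude_spectrum(frame)
features = band_energies(spectrum, 4)
```

A feature vector such as `features` could then be passed to a classifier to label audio segments. Which classifier and which spectral features the paper actually uses is not stated in the abstract.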