Improving Semantic Scene Categorization by Exploiting Audio-Visual Features

Songhao Zhu, Junchi Yan, Yuncai Liu
DOI: https://doi.org/10.1109/ICIG.2009.17
2009-01-01
Abstract: We address the problem of categorizing scenes from feature films into semantic classes based on audio-visual cues. Specifically, we first exploit the grammar of film production to specify the semantic content of scenes. Then, each scene is classified into one of the following categories: conversation, action, and suspense. Finally, to obtain more specific scene categories that are consistent with human perception, conversation scenes are further categorized into emotional and common conversations, and action scenes are further categorized into gunfight, beating, and chasing scenes. This work is a step toward effective and efficient applications such as content browsing and retrieval of feature films under limited bandwidth, video repositories, and rating of feature films of interest.
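The two-level category scheme described in the abstract can be sketched as a small data structure. This is only an illustration of the label hierarchy, not the authors' classifier: the names `TAXONOMY` and `label_path` are hypothetical, and the abstract does not say which audio-visual features or decision rules are used, so none are shown here.

```python
from typing import Dict, List, Optional

# Coarse categories and their finer subcategories, as listed in the abstract.
# Suspense has no subcategories in the abstract, so its list is empty.
TAXONOMY: Dict[str, List[str]] = {
    "conversation": ["emotional conversation", "common conversation"],
    "action": ["gunfight", "beating", "chasing"],
    "suspense": [],
}


def label_path(coarse: str, fine: Optional[str] = None) -> str:
    """Validate a (coarse, fine) label pair and return it as 'coarse/fine'.

    Hypothetical helper: it only checks that the labels fit the hierarchy.
    """
    if coarse not in TAXONOMY:
        raise ValueError(f"unknown coarse category: {coarse!r}")
    if fine is None:
        return coarse
    if fine not in TAXONOMY[coarse]:
        raise ValueError(f"{fine!r} is not a subcategory of {coarse!r}")
    return f"{coarse}/{fine}"


if __name__ == "__main__":
    print(label_path("action", "gunfight"))
    print(label_path("suspense"))
```

Keeping the hierarchy in one mapping makes it easy to check that a fine-grained label is only ever assigned under its matching coarse category.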