Audio and Video Combined for Home Video Abstraction

M Zhao, JJ Bu, C Chen
DOI: https://doi.org/10.1109/icassp.2003.1200046
2004-01-01
Abstract: As more people can afford to record their lives on video, home videos play an increasingly important role in multimedia. Video abstraction is an efficient way to review such large collections of home video. A home video abstraction technique combining audio and video features is presented. The audio content is first classified as silence, pure speech, non-pure speech, music, or background sound using support vector machines (SVMs). Non-pure speech is then further classified into song and other non-pure speech using an SVM, and background sound is classified into laughter, applause, scream, and others using hidden Markov models (HMMs). For the video content, motion level and blur degree are measured. Finally, video segments containing special features, such as speech, laughter, song, applause, scream, and a specified motion level and blur degree, are extracted as the main parts of the abstract. The remaining parts of the abstract are generated using key frame information. Experimental results show that the proposed algorithm can extract the desired parts of a home video to generate satisfactory video abstracts.
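The two-stage audio labelling and the segment selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Segment` fields, the `refine_label` and `select_segments` helpers, the classifier callables, the normalised 0 to 1 scales, the threshold values, and the rule that either an audio cue or a video cue is enough to select a segment are all assumptions made for illustration. The abstract does not give these details.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Audio classes treated as "special features" in this sketch.
# Assumption: "speech" in the abstract is mapped to the "pure speech" label.
SPECIAL_AUDIO = {"pure speech", "song", "laughter", "applause", "scream"}


@dataclass
class Segment:
    start: float      # segment start time in seconds
    end: float        # segment end time in seconds
    audio_label: str  # final audio label after both classification stages
    motion: float     # hypothetical normalised motion level in [0, 1]
    blur: float       # hypothetical normalised blur degree in [0, 1]


def refine_label(
    coarse: str,
    features: Dict[str, float],
    song_clf: Callable[[Dict[str, float]], bool],
    background_clf: Callable[[Dict[str, float]], str],
) -> str:
    """Second stage: refine a coarse label from the first-stage classifier.

    song_clf stands in for the second SVM, background_clf for the HMMs.
    Both are placeholder callables, not trained models.
    """
    if coarse == "non-pure speech":
        return "song" if song_clf(features) else "other non-pure speech"
    if coarse == "background sound":
        return background_clf(features)
    # silence, pure speech, and music are not refined further
    return coarse


def select_segments(
    segments: List[Segment],
    min_motion: float = 0.5,  # assumed threshold, not from the paper
    max_blur: float = 0.3,    # assumed threshold, not from the paper
) -> List[Segment]:
    """Keep segments with a special audio label or a qualifying video cue."""
    selected = []
    for seg in segments:
        has_special_audio = seg.audio_label in SPECIAL_AUDIO
        has_special_video = seg.motion >= min_motion and seg.blur <= max_blur
        if has_special_audio or has_special_video:
            selected.append(seg)
    return selected
```

In this sketch, segments that are not selected would be the ones left for the key-frame step that the abstract mentions. That step is not modelled here.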