Augmented Segmentation and Visualization for Presentation Videos

Alexander Haubold, John R. Kender
DOI: https://doi.org/10.48550/arXiv.cs/0501044
2005-01-21
Abstract: We investigate methods of segmenting, visualizing, and indexing presentation videos by separately considering audio and visual data. The audio track is segmented by speaker, and augmented with key phrases which are extracted using an Automatic Speech Recognizer (ASR). The video track is segmented by visual dissimilarities and augmented by representative key frames. An interactive user interface combines a visual representation of audio, video, text, and key frames, and allows the user to navigate a presentation video. We also explore clustering and labeling of speaker data and present preliminary results.
Multimedia, Information Retrieval