Detecting Story-Related Subject Captions in Chinese News Videos Using Spatio-Temporal Analysis

Xihan Wang, Xiaoyi Feng, Zhaoqiang Xia, Abdenour Hadid
DOI: https://doi.org/10.1109/icspcc.2016.7753639
2016-01-01
Abstract: Story-related subject captions (SSCs) are texts added to news videos to summarize the story. Among the many types of caption text (channel logos, scene locations, scrolling marquees, times, speaker names, audio subtitles, dates, etc.), only SSCs describe the news stories themselves, which makes them useful for news video analysis and retrieval systems. Unlike most existing methods in the literature, which mainly focus on detecting all types of caption text in static images, we propose an efficient approach for detecting story-related subject captions using spatio-temporal analysis. We design several types of spatial and temporal features that are intrinsic to SSCs and robust to different variations. The Stroke Width Transform (SWT) is first applied to compute connected components. These components are then clustered into chains with similar geometric properties and colors, yielding candidate text captions. These candidates are finally refined using spatial and temporal analysis. A new, challenging data set with ground truth and an evaluation protocol is built and will be made publicly available for research purposes. Our experimental analysis shows that the proposed algorithm yields promising results that compare favorably against traditional approaches in the literature.
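The chaining step mentioned in the abstract (clustering connected components with similar geometric properties and colors) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Component` fields, the `similar` predicate, and every threshold below are hypothetical choices made only for illustration, and the components are assumed to have already been extracted (for example by SWT) with positive heights and stroke widths.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    """Hypothetical summary of one connected component (not from the paper)."""
    x: float                           # left edge of the bounding box
    y: float                           # top edge of the bounding box
    w: float                           # bounding-box width
    h: float                           # bounding-box height (assumed > 0)
    stroke: float                      # typical stroke width (assumed > 0)
    color: Tuple[float, float, float]  # mean RGB color


def similar(a: Component, b: Component,
            max_height_ratio: float = 1.5,
            max_stroke_ratio: float = 1.5,
            max_color_dist: float = 40.0,
            max_gap_factor: float = 1.5) -> bool:
    """Return True if two components plausibly belong to the same text line.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    height_ratio = max(a.h, b.h) / min(a.h, b.h)
    stroke_ratio = max(a.stroke, b.stroke) / min(a.stroke, b.stroke)
    color_dist = sum((p - q) ** 2 for p, q in zip(a.color, b.color)) ** 0.5
    # Horizontal gap between the two boxes (negative if they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    # Vertical centers must be close for the boxes to share a line.
    same_line = abs((a.y + a.h / 2) - (b.y + b.h / 2)) < 0.5 * max(a.h, b.h)
    return (height_ratio <= max_height_ratio
            and stroke_ratio <= max_stroke_ratio
            and color_dist <= max_color_dist
            and gap <= max_gap_factor * max(a.h, b.h)
            and same_line)


def group_chains(components: List[Component]) -> List[List[int]]:
    """Group component indices into chains using union-find over `similar`."""
    parent = list(range(len(components)))

    def find(i: int) -> int:
        # Path halving keeps the trees shallow.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(components)):
        for j in range(i + 1, len(components)):
            if similar(components[i], components[j]):
                parent[find(i)] = find(j)

    chains = {}
    for i in range(len(components)):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())
```

In this sketch, each chain of indices would correspond to one candidate caption; the spatial and temporal refinement described in the abstract would then be applied to those candidates and is not shown here.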