Video text detection and localization based on localized generalization error model

Xian-Heng Ma, Wing W. Y. Ng, Patrick P. K. Chan, Daniel S. Yeung
DOI: https://doi.org/10.1109/icmlc.2010.5580484
2010-07-01
Abstract: Text in videos provides rich information for video analysis tasks such as indexing, understanding and retrieval. In this work, we propose a neural network based method for detecting text in video frames. The proposed method consists of three major steps: feature extraction, text region detection and candidate region refinement. First, we extract texture features from four edge maps computed from the target video frame. Second, a Radial Basis Function Neural Network (RBFNN) optimized by the Localized Generalization Error Model (L-GEM) is applied to detect text candidates. Finally, a false-detection removal step is applied to fine-tune the result. Experimental results demonstrate that the proposed method is efficient for text of different font colors, font sizes and languages against complex backgrounds.
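The first two steps of the pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give these details: the four edge maps are taken to be horizontal, vertical and two diagonal Sobel-style responses, the texture features are the mean and standard deviation of each map inside a window, and the RBFNN uses Gaussian hidden units. The helper names (`edge_maps`, `window_features`, `rbf_score`) are hypothetical, and L-GEM-based selection of the network architecture is not implemented.

```python
import numpy as np

# Assumed Sobel-style directional kernels; the paper's actual edge
# operators are not specified in the abstract.
KERNELS = {
    "horizontal": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float),
    "vertical": np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float),
    "diag_45": np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=float),
    "diag_135": np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=float),
}


def filter2d(image, kernel):
    """Valid-mode 2D cross-correlation; returns the absolute response."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return np.abs(out)


def edge_maps(gray):
    """Compute the four directional edge maps of a grayscale frame."""
    return [filter2d(gray, k) for k in KERNELS.values()]


def window_features(maps, top, left, size):
    """Assumed texture features: mean and std of each edge map in a window."""
    feats = []
    for m in maps:
        patch = m[top:top + size, left:left + size]
        feats.extend([patch.mean(), patch.std()])
    return np.array(feats)


def rbf_score(x, centers, widths, weights, bias=0.0):
    """Output of a Gaussian RBF network for one feature vector.

    centers: (k, d) array, widths: (k,) array, weights: (k,) array.
    In the paper these would come from training guided by L-GEM;
    here they are simply supplied by the caller.
    """
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-d2 / (2.0 * widths ** 2))
    return float(phi @ weights + bias)
```

In use, a window would be slid over the frame, `window_features` computed at each position, and windows whose `rbf_score` exceeds a chosen threshold kept as text candidates for the refinement step.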