Scene text extraction based on edges and support vector regression

Shijian Lu, Tao Chen, Shangxuan Tian, Joo-Hwee Lim, Chew-Lim Tan
DOI: https://doi.org/10.1007/s10032-015-0237-z
2015-02-08
International Journal on Document Analysis and Recognition (IJDAR)
Abstract: This paper presents a scene text extraction technique that automatically detects and segments text in scene images. Three text-specific features are designed over image edges, and these are first used to detect a set of candidate text boundaries. For each candidate text boundary, one or more candidate characters are then extracted using a local threshold estimated from the surrounding image pixels. The real characters and words are finally identified by a support vector regression model trained on a bag-of-words representation. The proposed technique has been evaluated on the ICDAR-2013 Robust Reading Competition dataset. Experiments show that it obtains superior F-measures of 78.19% and 75.24% (at the atom level) for the scene text detection and segmentation tasks, respectively.
computer science, artificial intelligence
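
The candidate character extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the local threshold is estimated, so this example assumes a plain mean over the candidate's bounding box expanded by a small margin, and it assumes dark text on a lighter background. The function names, the `margin` parameter, and the box convention are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]           # grayscale image, row-major, values 0..255
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), bottom/right exclusive


def local_threshold(image: Image, box: Box, margin: int = 2) -> float:
    """Estimate a threshold from the pixels surrounding a candidate boundary.

    Assumption: the threshold is the mean intensity over the box expanded
    by `margin` pixels on every side, clipped to the image borders.
    """
    top, left, bottom, right = box
    height, width = len(image), len(image[0])
    r0, r1 = max(0, top - margin), min(height, bottom + margin)
    c0, c1 = max(0, left - margin), min(width, right + margin)
    values = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def extract_candidate(image: Image, box: Box, margin: int = 2) -> List[List[int]]:
    """Return a binary mask of the box: 1 where a pixel is darker than
    the local threshold (assumed to be text), 0 otherwise."""
    threshold = local_threshold(image, box, margin)
    top, left, bottom, right = box
    return [
        [1 if image[r][c] < threshold else 0 for c in range(left, right)]
        for r in range(top, bottom)
    ]
```

For a 5x5 image of intensity 200 with a dark vertical stroke of intensity 20, calling `extract_candidate` on a box around the stroke marks only the stroke pixels. A mean is only one possible choice of estimator; the classification of the resulting candidates by support vector regression is not sketched here.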