New method for text detection and segmentation from complex images

Fang Liu, Xiang Peng, Tianjiang Wang
DOI: https://doi.org/10.1117/12.750055
2007-01-01
Abstract: Textual information contained in images is a valuable source of high-level semantics for image indexing and retrieval. This paper proposes a new method to detect and segment text from complex images. First, a density-based clustering method, borrowed from data mining, is employed to discover candidate text regions. It computes the density distribution of the whole image and groups spatially connected pixels with similar color or grayscale into one region. The clustered regions are treated as candidate text regions. Simple heuristics are then applied to discard the obviously non-text regions among the candidates. Because a few non-text regions still remain, a texture-based method is used to select the text regions from the filtered candidates. To reduce the time cost of the density computation in the clustering step, an approximate algorithm is designed to improve efficiency. Experimental results show that the method is robust to variations in text font, orientation, language, and size.
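The first two stages described in the abstract (density-based clustering of spatially connected, similar pixels, followed by simple heuristic filtering) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the density definition, the thresholds, or the heuristics, so the code below assumes a DBSCAN-style rule on a grayscale grid and a bounding-box size filter. All function names and the parameters `eps`, `min_pts`, `max_frac`, and `min_side` are hypothetical, and the texture-based selection and the approximate density computation are not shown.

```python
from collections import deque


def similar_neighbors(img, r, c, eps):
    """8-connected neighbours of (r, c) whose gray level is within eps."""
    h, w = len(img), len(img[0])
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and abs(img[rr][cc] - img[r][c]) <= eps:
                out.append((rr, cc))
    return out


def density_cluster(img, eps=10, min_pts=3):
    """DBSCAN-style labelling (assumed rule, not the paper's).

    A pixel is a core pixel if it has at least min_pts similar neighbours.
    Returns (labels, count); label 0 means the pixel was left unassigned.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if labels[r][c] or len(similar_neighbors(img, r, c, eps)) < min_pts:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                pr, pc = queue.popleft()
                nbrs = similar_neighbors(img, pr, pc, eps)
                if len(nbrs) < min_pts:
                    continue  # border pixel: joins the region but does not grow it
                for nr, nc in nbrs:
                    if labels[nr][nc] == 0:
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


def candidate_regions(img, eps=10, min_pts=3, max_frac=0.5, min_side=2):
    """Cluster, then drop regions that are too thin or too large (assumed heuristics).

    Returns {label: (top, left, bottom, right)} for the surviving regions.
    """
    labels, _ = density_cluster(img, eps, min_pts)
    h, w = len(img), len(img[0])
    boxes = {}
    for r in range(h):
        for c in range(w):
            lab = labels[r][c]
            if lab:
                r0, c0, r1, c1 = boxes.get(lab, (r, c, r, c))
                boxes[lab] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    kept = {}
    for lab, (r0, c0, r1, c1) in boxes.items():
        bh, bw = r1 - r0 + 1, c1 - c0 + 1
        if min(bh, bw) < min_side or bh * bw > max_frac * h * w:
            continue  # too thin to be text, or large enough to be background
        kept[lab] = (r0, c0, r1, c1)
    return kept
```

On a toy image with a small dark patch on a uniform light background, the background forms one large region that the size heuristic discards, while the patch survives as a candidate. A real implementation would work on color as well as grayscale and would need the later texture-based step to reject remaining false candidates.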