Word Detecting in Document Image Based on Two-Stage Model

Xiujuan Li, Zhimin Huang, Ying Wen, Yue Lu
DOI: https://doi.org/10.1007/978-3-642-34595-1_25
2012-01-01
Abstract: This paper proposes a word detection method for document images that uses character models and word models to evaluate single-character and between-character features. First, the text line is segmented into several fragments. Second, candidate characters are generated by merging consecutive fragments, and a candidate is accepted if it conforms to the character models of the query word. Third, a path search strategy is used to search for candidate words constructed from the accepted candidate characters, and the word model is used to determine the matching cost. Experimental results on a dataset of document images demonstrate the effectiveness of the proposed method.
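The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the character model, the word model, or the search procedure. The sketch assumes the text line is already segmented into fragments, and uses dynamic programming as a stand-in path search. The names `detect_word`, `char_cost`, `pair_cost`, `max_merge`, and `char_threshold` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A candidate character: fragments [start, end) merged together.
Span = Tuple[int, int]


def detect_word(
    n_fragments: int,
    query: str,
    char_cost: Callable[[Span, str], float],    # assumed character model
    pair_cost: Callable[[Span, Span], float],   # assumed between-character (word) model
    max_merge: int = 3,                         # assumed limit on fragments per character
    char_threshold: float = 1.0,                # assumed acceptance threshold
) -> Optional[Tuple[float, List[Span]]]:
    """Return (matching cost, character spans) of the cheapest candidate word, or None."""
    # layers[k][span] = (best cost so far, previous span) for query[k] placed on span.
    layers: List[Dict[Span, Tuple[float, Optional[Span]]]] = []
    for k, ch in enumerate(query):
        layer: Dict[Span, Tuple[float, Optional[Span]]] = {}
        for start in range(n_fragments):
            last_end = min(start + max_merge, n_fragments)
            for end in range(start + 1, last_end + 1):
                span = (start, end)
                cost = char_cost(span, ch)
                if cost > char_threshold:
                    continue  # stage 1: candidate does not fit the character model
                if k == 0:
                    layer[span] = (cost, None)
                    continue
                # Stage 2: extend the cheapest path whose last character ends here.
                best: Optional[Tuple[float, Span]] = None
                for prev, (prev_cost, _) in layers[k - 1].items():
                    if prev[1] != start:
                        continue  # characters must be adjacent in the text line
                    total = prev_cost + pair_cost(prev, span) + cost
                    if best is None or total < best[0]:
                        best = (total, prev)
                if best is not None:
                    layer[span] = best
        layers.append(layer)

    if not layers or not layers[-1]:
        return None  # no candidate word covers every query character

    # Backtrack from the cheapest final character.
    cur, (total_cost, _) = min(layers[-1].items(), key=lambda kv: kv[1][0])
    path: List[Span] = []
    for k in range(len(query) - 1, -1, -1):
        path.append(cur)
        cur = layers[k][cur][1]
    path.reverse()
    return total_cost, path
```

In practice `char_cost` and `pair_cost` would be computed from image features of the merged fragments. Here they are left as callables so that the search structure stays separate from whatever models are chosen.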