A General Framework For Multi-Character Segmentation And Its Application In Recognizing Multilingual Asian Documents

D Wen, Xq Ding
DOI: https://doi.org/10.1117/12.528951
2004-01-01
Abstract: In this paper we propose a general framework for character segmentation in complex multilingual documents, which is an endeavor to combine the traditionally separated segmentation and recognition processes into a cooperative system. The framework contains three basic steps: Dissection, Local Optimization, and Global Optimization, which are designed to fuse various properties of the segmentation hypotheses hierarchically into a composite evaluation that decides the final recognition results. Experimental results show that this framework is general enough to be applied to a variety of documents. Finally, a sample system based on this framework for recognizing Chinese, Japanese, and Korean documents is described, and its experimental performance is reported.
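The abstract names the three steps but does not specify the algorithms behind them. As a rough illustration only, the sketch below uses a common over-segmentation pattern that fits the description: candidate cut points stand in for the output of Dissection, a scoring callback stands in for the composite evaluation of each segmentation hypothesis (Local Optimization), and dynamic programming selects the best overall path (Global Optimization). The names `best_segmentation`, `toy_score`, `max_merge`, and the width-based score are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    cuts: List[int],
    score: Callable[[int, int], float],
    max_merge: int = 3,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Pick the highest-scoring way to group primitive pieces into characters.

    cuts      -- sorted candidate cut positions (assumed output of dissection)
    score     -- hypothetical evaluation of the segment [start, end); in a
                 real system this might mix recognizer confidence and geometry
    max_merge -- maximum number of adjacent primitive pieces in one character
    """
    n = len(cuts) - 1                  # number of primitive pieces
    best = [float("-inf")] * (n + 1)   # best[j]: best score covering pieces 0..j-1
    back = [0] * (n + 1)               # back[j]: start index of the last segment
    best[0] = 0.0

    for j in range(1, n + 1):
        for i in range(max(0, j - max_merge), j):
            candidate = best[i] + score(cuts[i], cuts[j])
            if candidate > best[j]:
                best[j] = candidate
                back[j] = i

    # Trace the back-pointers to recover the chosen segments.
    path: List[Tuple[int, int]] = []
    j = n
    while j > 0:
        i = back[j]
        path.append((cuts[i], cuts[j]))
        j = i
    path.reverse()
    return best[n], path


def toy_score(start: int, end: int) -> float:
    """Toy stand-in for a composite evaluation: prefer segments 10 units wide."""
    return -abs((end - start) - 10)


# Five primitive pieces of widths 5, 5, 10, 4, 6.
total, path = best_segmentation([0, 5, 10, 20, 24, 30], toy_score)
print(total, path)
```

In this toy run the two narrow pieces at each end are merged into full-width characters, which is the kind of decision a cooperative segmentation and recognition system would make with a real recognizer score in place of `toy_score`.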