Printed Arabic Document Recognition System

Jianming Jin, Hua Wang, Xiaoqing Ding, Liangrui Peng
DOI: https://doi.org/10.1117/12.585711
2005-01-01
Abstract: Arabic is a cursive script, and the characteristics of Arabic text differ greatly from those of Latin or Chinese text. For example, an Arabic character has up to four written forms, and characters that can be joined are always connected on the baseline. The methods used for Arabic document recognition therefore differ from those used for Latin and Chinese, and the segmentation of Arabic characters is the most critical problem. This paper presents a printed Arabic document recognition system composed of text line segmentation, word segmentation, character segmentation, character recognition, and post-processing stages. Experiments show that the system achieves a recognition accuracy of 97.62%.
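The abstract does not say how each stage is implemented. As an illustration of the first stage only, the following minimal sketch segments text lines with a horizontal projection profile, a common technique that is assumed here and is not confirmed as the authors' method. The function name `segment_text_lines`, the `min_ink` threshold, and the binary-image representation (lists of 0/1 values, 1 meaning ink) are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def segment_text_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    Assumption: `image` is a binarized page where 1 marks an ink pixel.
    This is an illustrative projection-profile approach, not the paper's method.
    """
    # Horizontal projection: count the ink pixels in every row.
    profile = [sum(row) for row in image]

    lines: List[Tuple[int, int]] = []
    start = None  # Row index where the current text line began, if any.
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # Entering a run of inked rows.
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # Leaving the run: close the line.
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # Line touching the bottom edge.
    return lines


# Tiny synthetic page: two inked bands separated by a blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(segment_text_lines(page))  # [(1, 3), (4, 5)]
```

Blank rows separate printed text lines reliably, which is why a projection gap works at this stage. The same idea does not carry over directly to character segmentation: because joinable Arabic characters are connected on the baseline, there is usually no blank column between them, which is consistent with the abstract naming character segmentation as the most critical problem.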