Uyghur Character Models with Shared Structure Information for Segmentation-free Recognition under Low Data Resource Conditions

Jiang Zhi-wei, Ding Xiao-qing, Peng Liang-rui, Liu Chang-song
DOI: https://doi.org/10.11999/jeit150019
2015-01-01
Abstract: Although segmentation-free Uyghur document recognition effectively avoids character segmentation errors, it performs poorly on new types of samples for which little training data is available. This paper proposes sharing stable character structure information among different Uyghur fonts, and uses Bootstrap to make more efficient use of the available samples. Experiments are conducted on new-type book samples, for which only 1/5 as many training samples are available as for the original samples. The proposed method achieves an average character recognition accuracy of 95.05% on the test samples, a relative reduction in recognition error rate of 55.76%~63.84% compared with the Maximum A Posteriori (MAP) method. The proposed method can therefore train accurate Uyghur character models under low data resource conditions.
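The relative reduction in recognition error rate quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual definition, (baseline error - new error) / baseline error, which the abstract does not spell out. The function name and the 12% baseline error rate are hypothetical, chosen only for illustration, and are not figures from the paper; only the 95.05% accuracy comes from the abstract.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Relative decrease in error rate of a new method versus a baseline.

    Both arguments are error rates in [0, 1]; the baseline must be non-zero.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Error rate implied by the reported average accuracy of 95.05%.
proposed_error = 1.0 - 0.9505

# Hypothetical baseline error rate (NOT from the paper), for illustration only.
hypothetical_map_error = 0.12

reduction = relative_error_reduction(hypothetical_map_error, proposed_error)
print(f"relative error reduction: {reduction:.2%}")
```

A relative figure of this kind compares error rates rather than accuracies, so a gain of a few points in accuracy can correspond to a large relative reduction in error.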