End-to-End Model Based on Bidirectional LSTM and CTC for Segmentation-free Traditional Mongolian Recognition

Weiyuan Wang, Hongxi Wei, Hui Zhang
DOI: https://doi.org/10.23919/chicc.2019.8866073
2019-01-01
Abstract: A large number of Mongolian books and documents need to be stored and processed by computer. Optical Character Recognition (OCR) can convert a scanned document into text. However, existing Mongolian OCR systems rely on a glyph segmentation scheme, and glyph segmentation is much more difficult for several fonts. In this study, a segmentation-free approach based on an end-to-end model is proposed for traditional Mongolian word recognition. In particular, the proposed model extracts features directly from the input word images and generates the corresponding recognition results (i.e., a sequence of letters). Experimental results demonstrate that the proposed end-to-end model outperforms the segmentation-based method. Moreover, the proposed model can also handle out-of-vocabulary words.
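The abstract does not say how the letter sequence is read off the model output, so the following is only a minimal sketch of generic CTC best-path (greedy) decoding, the usual way a CTC-trained network such as the bidirectional LSTM named in the title is turned into text. The function name, the toy alphabet, the per-frame scores, and the choice of index 0 as the blank symbol are all assumptions made for illustration, not details from the paper.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: Sequence[str],
    blank: int = BLANK,
) -> str:
    """Best-path CTC decoding: take the top class per frame,
    merge consecutive repeats, then drop blanks."""
    letters: List[str] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring class for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a letter only when it is not blank and not a repeat
        # of the class chosen for the previous frame.
        if best != blank and best != previous:
            letters.append(alphabet[best])
        previous = best
    return "".join(letters)


# Hypothetical three-class alphabet: blank placeholder plus two letters.
toy_alphabet = ["-", "a", "b"]
# Hypothetical per-frame scores, as a recurrent network might output them.
toy_scores = [
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.7, 0.2],  # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.6, 0.2],  # a
    [0.1, 0.2, 0.7],  # b
    [0.1, 0.1, 0.8],  # b (repeat, merged)
]
decoded = ctc_greedy_decode(toy_scores, toy_alphabet)
```

Because decoding of this kind emits letters frame by frame rather than choosing from a fixed word list, any letter sequence can in principle be produced, which is consistent with the abstract's remark about out-of-vocabulary words.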