Manchu Word Recognition Based on Convolutional Neural Network with Spatial Pyramid Pooling

Min Li, Ruirui Zheng, Shuang Xu, Yu Fu, Di Huang
DOI: https://doi.org/10.1109/cisp-bmei.2018.8633131
2018-01-01
Abstract: Manchu character recognition is important for protecting and researching Manchu culture and history. Previous methods of Manchu character recognition are mainly based on conventional machine learning with shallow, manually selected features, so their recognition results are unsatisfactory. Convolutional neural networks achieve high accuracy in optical character recognition because the convolution operators automatically extract deep structural features. However, a conventional convolutional neural network requires input images of a fixed size, whereas Manchu, as a phonemic language, has words of arbitrary length. Applying a conventional convolutional neural network directly to Manchu word recognition therefore requires normalizing the image size, and this normalization limits improvement in recognition accuracy. This paper replaces the last max-pooling layer of a convolutional neural network with a spatial pyramid pooling layer and proposes a classifier that recognizes Manchu words of arbitrary size without segmenting the word. The experiments indicate that the proposed Manchu word recognition models achieve a highest accuracy of 0.9768, higher than that of the conventional convolutional neural network, and that they require no normalization of input images during recognition. The proposed models thus outperform their conventional counterparts in both accuracy and flexibility.
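The key property of spatial pyramid pooling, mapping feature maps of any size to a vector of fixed length, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `spp_max_pool`, the pyramid levels `(1, 2, 4)`, and the floor/ceil bin boundaries are assumptions chosen for illustration, and only a single channel is handled (a real network would apply the same pooling to every channel).

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map into a fixed-length vector.

    feature_map: non-empty list of equal-length rows (H x W).
    levels: hypothetical pyramid levels; level n splits the map
            into an n x n grid of bins and keeps the max of each bin.
    The output length is sum(n * n for n in levels), whatever H and W are.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin rows span floor(i*H/n) .. ceil((i+1)*H/n), so no bin is empty.
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right)
                    )
                )
    return pooled


# Two toy "word images" of different widths (assumed sizes, for illustration).
short_word = [[r * 5 + c for c in range(5)] for r in range(3)]    # 3 x 5
long_word = [[r * 11 + c for c in range(11)] for r in range(3)]   # 3 x 11

# Both yield 1 + 4 + 16 = 21 values, so a fully connected layer can follow.
assert len(spp_max_pool(short_word)) == len(spp_max_pool(long_word)) == 21
```

Because the number of bins depends only on the pyramid levels, the layers after the pooling step see the same input length for short and long words, which is why no image-size normalization is needed.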