An Improved Approach Based on CNN-RNNs for Mathematical Expression Recognition

Wei Zhang, Zhiqiang Bai, Yuesheng Zhu
DOI: https://doi.org/10.1145/3330393.3330410
2019-01-01
Abstract: Mathematical expression recognition (MER) in images is challenging because it requires both recognizing formula symbols and analyzing the structure of the expression. Optical character recognition (OCR) is widely used for natural language text and in many other areas, but it has difficulty recognizing some special formula symbols and determining their positions accurately. In this paper, we propose an improved end-to-end MER approach based on CNN-RNNs (convolutional neural network and recurrent neural networks) that improves formula symbol recognition and localization. A CNN extracts features of the mathematical expression from the image, and RNNs generate the expression. To better capture the features of blurry or small symbols and to locate symbols accurately, we double the source images and extract features from the preprocessed images with the CNN. We also add a double-attention mechanism between the encoder and decoder so that symbol positions can be obtained accurately. In addition, a dropout layer is introduced to prevent overfitting and improve the generalization of the MER model. We evaluate the approach on the IM2LATEX-100K dataset and compare it with other methods. It achieves a BLEU score of 88.42%, which is better than the other methods compared.
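To make the role of attention between the encoder and decoder concrete, the following is a minimal sketch of one generic soft-attention step. It is not the paper's double-attention mechanism, whose details the abstract does not give. The dot-product scoring, the function names `softmax` and `attend`, and the toy feature values are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_features):
    """Return (weights, context) for one decoding step.

    decoder_state: current decoder hidden state (hypothetical values).
    encoder_features: one feature vector per image location.
    Dot-product scoring is an assumption, not the paper's scoring function.
    """
    # Score each image location against the decoder state.
    scores = [
        sum(h * f for h, f in zip(decoder_state, feat))
        for feat in encoder_features
    ]
    # Normalize the scores into weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted sum of the location features.
    dim = len(encoder_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, encoder_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three image locations with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(state, features)
```

In this toy example the first and third locations match the decoder state equally well, so they receive equal weight and the second location receives less. A decoder would use `context` together with its hidden state to predict the next output token.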