TRMER: Transformer-Based End to End Printed Mathematical Expression Recognition.

Zhaokun Zhou, Shuaijian Ji, Yuqing Wang, Zhenyu Weng, Yuesheng Zhu
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191139
2023-01-01
Abstract: As a fundamental task of transcribing formula images into structured mathematical expressions, Printed Mathematical Expression Recognition (PMER) is widely used in many fields. However, an end-to-end approach that fully exploits the spatial structure and semantic information in a formula to achieve high recognition accuracy is still lacking. In this work, a Transformer-based Mathematical Expression Recognition (TRMER) model is proposed to improve recognition accuracy. A Dual-Branch Encoder (DBE) is developed to extract multi-scale feature maps from a formula image so that spatial and semantic information are obtained simultaneously, and the feature maps are fused by a Fusion Enhancement Module (FEM) that merges and reinforces the spatial-semantic information. A standard Transformer-based decoder then decodes this rich spatial-semantic information and outputs the recognized mathematical expression as a LaTeX sequence. Experimental results show that TRMER achieves state-of-the-art recognition performance.
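To make the decoder's output format concrete, the following is a minimal sketch of how a LaTeX string might be split into the token sequence that a sequence decoder would be trained to emit. The abstract does not describe TRMER's tokenizer or vocabulary, so the regular expression, the special tokens (`<pad>`, `<sos>`, `<eos>`, `<unk>`), and the helper names `tokenize_latex`, `build_vocab`, `encode`, and `decode` are all illustrative assumptions rather than the authors' implementation.

```python
import re
from typing import Dict, Iterable, List

# Assumed token pattern (not from the paper): a LaTeX command such as \frac,
# an escaped symbol such as \{ or \\, or any single non-space character.
TOKEN_PATTERN = re.compile(r"\\[A-Za-z]+|\\.|\S")


def tokenize_latex(formula: str) -> List[str]:
    """Split a LaTeX string into tokens, dropping whitespace."""
    return TOKEN_PATTERN.findall(formula)


def build_vocab(formulas: Iterable[str]) -> Dict[str, int]:
    """Map each token to an integer id; the four special tokens are assumed."""
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2, "<unk>": 3}
    for formula in formulas:
        for token in tokenize_latex(formula):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(formula: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a formula into ids, wrapped in start and end markers."""
    ids = [vocab["<sos>"]]
    ids += [vocab.get(tok, vocab["<unk>"]) for tok in tokenize_latex(formula)]
    ids.append(vocab["<eos>"])
    return ids


def decode(ids: Iterable[int], vocab: Dict[str, int]) -> str:
    """Turn ids back into a space-separated LaTeX string."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    skip = {vocab["<pad>"], vocab["<sos>"], vocab["<eos>"]}
    return " ".join(inverse[i] for i in ids if i not in skip)
```

Under these assumptions, a recognizer would predict one id per step until it emits the end marker, and `decode` would map the predicted ids back to LaTeX. Tokens absent from the vocabulary fall back to `<unk>`, which is one common design choice among several.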