Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition

Yingnan Fu, Wenyuan Cai, Ming Gao, Aoying Zhou
DOI: https://doi.org/10.1145/3591106.3592259
2023-01-01
Abstract: Most recent handwritten mathematical expression recognition (HMER) methods adopt the attention-based encoder-decoder framework, which generates LaTeX sequences from input images. However, the accuracy of the attention mechanism limits the performance of HMER models, and the lack of global context information during decoding is a further challenge. Some methods use symbol-level counting to localize symbols and improve model performance, but they do not work well. In this paper, we propose the Symbol Location-Aware Network (SLAN) to address the HMER problem. Specifically, we propose an advanced relation-level counting method to detect symbols in the image, and we address the lack of global context with a new global context-aware decoder. To improve attention accuracy, we design a novel attention alignment loss based on a dynamic programming algorithm, which learns attention alignment directly without pixel-level labels. Extensive experiments on the CROHME dataset demonstrate the effectiveness of each part of SLAN, which achieves state-of-the-art performance.