Relative Position Embedding Asymmetric Siamese Network for Offline Handwritten Mathematical Expression Recognition.
Chunyi Wang, Hurunqi Luo, Xiaqing Rao, Wei Hu, Ning Bi, Jun Tan
DOI: https://doi.org/10.1007/978-3-031-41676-7_7
2023-01-01
Abstract: Recurrent neural network (RNN)-based encoder-decoder models are currently widely used in handwritten mathematical expression recognition (HMER). Because of their recurrent structure, RNNs suffer from vanishing and exploding gradients, which makes them inefficient at processing long HME sequences. To address these problems, this paper proposes a Transformer-based encoder-decoder model that combines an asymmetric siamese network with a relative position embedding Transformer (ASNRT). With the assistance of printed images, the asymmetric siamese network narrows the difference between feature maps of similar formula images and widens the encoding gap between dissimilar formula images. We insert coordinate attention into the encoder and replace the RNN decoder with a Transformer decoder. Moreover, we use rotary position embedding, which incorporates relative position information through an absolute position encoding. Given the symmetry of MEs, we adopt a bidirectional decoding strategy. Extensive experiments show that our model improves the ExpRate of state-of-the-art methods on CROHME 2014, CROHME 2016, and CROHME 2019 by 0.94%, 2.18%, and 2.12%, respectively.
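The statement that rotary position embedding carries relative position information through an absolute encoding can be illustrated with a small sketch. The abstract does not give the exact formulation used in ASNRT, so the code below assumes the commonly used one-dimensional rotary scheme: each consecutive pair of vector components is rotated by an angle proportional to the token's absolute position. The function names `rope` and `dot`, the frequency base of 10000, and the example vectors are illustrative assumptions, not taken from the paper.

```python
import math


def rope(x, pos, base=10000.0):
    """Apply a rotary position embedding to vector x at absolute position pos.

    Assumption: standard 1D rotary scheme, where component pair i is rotated
    by the angle pos * base ** (-2 * i / d). The paper may use a variant.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        # 2D rotation of the pair (a, b) by angle theta
        out.extend([a * c - b * s, a * s + b * c])
    return out


def dot(u, v):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical query and key vectors
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same relative offset (3) at two different absolute positions
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))

# A different relative offset (1) for comparison
score_other = dot(rope(q, 5), rope(k, 4))

print(score_near, score_far, score_other)
```

In this sketch `score_near` and `score_far` agree up to floating-point error, because rotating the query by position m and the key by position n leaves a dot product that depends only on m - n. Each vector is encoded using only its own absolute position, yet the attention score reflects the relative offset.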