Abstract: Formula recognition aims to automatically identify mathematical formulas in images. Encoder-decoder models have significantly advanced the translation of images into the corresponding formula markup. Nonetheless, previous research has concentrated primarily on single-line formula recognition and has neglected multi-line formulas, which present additional challenges such as more stringent grammatical restrictions and two-dimensional positions. In this work, we present GAP (Grammar And Position-Aware formula recognition), a comprehensive framework designed to tackle the challenges of multi-line mathematical formula recognition. First, to overcome the limitations imposed by grammar, we design a novel Grammar Aware Contrastive Learning (GACL) module, which integrates complex grammar rules into the transcription model through a contrastive learning mechanism. However, vanilla contrastive learning lacks clear direction for comprehending grammar rules and can lead to unstable convergence or prolonged training cycles. To improve training efficiency, we propose Rank-Based Sampling (RBS), specialized for multi-line formulas, which guides the learning process according to an importance ranking of different grammar errors. Finally, spatial location information is critical given the two-dimensional nature of multi-line formulas. To help the model keep track of this global information, we introduce a Visual Coverage (VC) mechanism that incorporates historical attention information into the image features in a parameter-free way. To validate our GAP framework, we construct a new dataset, Multi-Line, containing 12,002 multi-line formulas, and conduct extensive experiments that show its efficacy in capturing grammatical rules, improving recognition accuracy, and increasing training efficiency. Code and datasets are available at https://github.com/Sinon02/GAP.
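
To make the idea of grammar-aware contrastive learning with rank-based sampling more concrete, the following is a minimal sketch and not the GAP implementation. It assumes an InfoNCE-style loss, negatives that are formulas corrupted by a grammar error, and a sampling weight of 1/rank (rank 1 meaning the most important error type). The function names, the dictionary fields, and the weighting scheme are all hypothetical choices made for illustration.

```python
import math
import random


def info_nce(pos_score, neg_scores, temperature=0.1):
    """InfoNCE-style loss: -log(exp(p/t) / (exp(p/t) + sum_i exp(n_i/t))).

    pos_score: similarity between the image and its correct markup.
    neg_scores: similarities between the image and grammar-corrupted markups.
    (Assumed loss form; the abstract does not specify the exact objective.)
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]


def rank_based_sample(negatives, k, rng):
    """Draw up to k negatives without replacement, favoring important errors.

    Each negative is a dict with a "rank" field (1 = most important).
    The 1/rank weighting is an assumption made for this sketch.
    """
    pool = list(negatives)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [1.0 / item["rank"] for item in pool]
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx))
    return chosen


# Hypothetical negatives: error types, ranks, and scores are made up.
negatives = [
    {"error": "unbalanced_braces", "rank": 1, "score": 0.7},
    {"error": "missing_line_break", "rank": 2, "score": 0.5},
    {"error": "misplaced_alignment", "rank": 3, "score": 0.4},
    {"error": "swapped_symbols", "rank": 4, "score": 0.2},
]

rng = random.Random(0)
batch = rank_based_sample(negatives, k=2, rng=rng)
loss = info_nce(pos_score=0.9, neg_scores=[n["score"] for n in batch])
print(f"sampled: {[n['error'] for n in batch]}, loss: {loss:.4f}")
```

In this sketch, the sampling step decides which grammar errors the model is contrasted against, and the loss pushes the score of the correct markup above the scores of the sampled corrupted ones.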

GAP: A Grammar and Position-Aware Framework for Efficient Recognition of Multi-Line Mathematical Formulas
