Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network

Jiale Wang, Junhui Yu, Huanyong Liu, Chenanran Kong
2024-09-18
Abstract: Hierarchical and complex Mathematical Expression Recognition (MER) is challenging because a formula can have multiple possible interpretations, which complicates both parsing and evaluation. In this paper, we introduce the Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset specifically designed to address these issues. It consists of a large-scale training set, HDR-100M, which offers unprecedented scale and diversity with one hundred million training instances, and a test set, HDR-Test, which includes multiple interpretations of complex hierarchical formulas for comprehensive evaluation of model performance. Additionally, the parsing of complex formulas often suffers from errors in fine-grained details. To address this, we propose the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative framework that incorporates a hierarchical sub-formula module focused on the precise handling of formula details, thereby significantly enhancing MER performance. Experimental results demonstrate that HDNet outperforms existing MER models across various datasets.
Computation and Language
What problem does this paper attempt to address?
This paper aims to address the challenges in Mathematical Expression Recognition (MER), especially the problem of formula recognition for multi-level and complex structures. Specifically, current MER models have the following main problems when dealing with highly complex mathematical expressions:

1. **Limitations of the dataset**: Existing datasets lack mathematical expressions with highly complex structures, which restricts the model's training and performance improvement in these aspects.
2. **Insufficient detail capture**: Complex mathematical formulas contain many subtle details, and existing models are often unable to accurately capture these details, resulting in parsing errors.

To solve these problems, the paper proposes the following innovations:

- **HDR dataset**: This is a large-scale, multi-label MER dataset, containing 100 million training instances (HDR-100M) and a test set (HDR-Test). The latter covers expressions of various complexities and supports multiple valid interpretations.
- **HDNet model**: This is a new MER framework based on an encoder-decoder structure. It introduces a hierarchical sub-formula module, focuses on accurately processing formula details, and significantly improves the performance of MER.
- **Fair evaluation method**: An improved evaluation method is proposed, which takes into account the functional equivalence of formulas and ensures a more fair comparison of model performance.

Through these innovations, the paper aims to improve the recognition accuracy and robustness of complex mathematical expressions.
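To make the idea of multi-label, equivalence-aware evaluation more concrete, here is a minimal Python sketch. It is not the paper's evaluation method, whose exact criterion is not described in this summary. It assumes, purely for illustration, that a prediction counts as correct if its normalized token sequence matches any one of several reference LaTeX strings. The names `normalize` and `is_correct` and the specific normalization rules (dropping `\left`/`\right`, unwrapping braces around a single-token script) are hypothetical.

```python
import re

# Matches a LaTeX command (\frac), an escaped symbol (\{), or any single
# non-whitespace character. Whitespace is discarded.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|\S")


def normalize(latex: str) -> tuple:
    """Return a crude canonical token sequence (illustrative rules only)."""
    # Drop sizing commands, which affect rendering but not meaning.
    tokens = [t for t in TOKEN_RE.findall(latex) if t not in (r"\left", r"\right")]

    out = []
    i = 0
    while i < len(tokens):
        # Unwrap braces around a single-token script: x^{2} -> x^2.
        if (
            tokens[i] in ("^", "_")
            and i + 3 < len(tokens)
            and tokens[i + 1] == "{"
            and tokens[i + 2] not in ("{", "}")
            and tokens[i + 3] == "}"
        ):
            out.extend([tokens[i], tokens[i + 2]])
            i += 4
        else:
            out.append(tokens[i])
            i += 1
    return tuple(out)


def is_correct(prediction: str, references: list) -> bool:
    """A prediction is accepted if it matches ANY valid reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


# Usage: two accepted writings of the same formula (assumed labels).
refs = [r"(a+b)^2", r"(a+b)(a+b)"]
print(is_correct(r"\left( a+b \right)^{2}", refs))
```

Token-level normalization only catches notational variants. Deciding true functional equivalence (for example, that two differently arranged expressions denote the same value) would require a semantic comparison, which this sketch does not attempt.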