Abstract: Math formulas (e.g., "distance = speed × time") are a fundamental kind of commonsense knowledge in human cognition: humans naturally acquire them and manipulate them in logical thinking when solving mathematical reasoning problems. However, existing reasoning models mainly learn linguistic heuristics or patterns to generate answers and pay too little attention to learning with such formula knowledge. As a result, they are not transparent (and thus not interpretable) in how they understand and grasp basic mathematical logic. In this paper, we first construct two datasets (Math23K-F and MAWPS-F) with precise annotations of formula usage in each reasoning step for math word problems. Our datasets are refined from the existing benchmark datasets, which ensures generality and comparability for related research. We then propose a novel Formula-mastered Solver (FOMAS) that solves problems under the guidance of mastered formula knowledge. Specifically, drawing insight from dual process theory, we build FOMAS from two systems, a Knowledge System and a Reasoning System, which learn and apply formula knowledge, respectively. The Knowledge System accumulates the math formulas, and we propose a novel pretraining scheme for it that mimics how humans grasp the mathematical logic behind them. In the Reasoning System, we develop formula-guided symbol prediction and goal generation methods that retrieve the necessary formula knowledge from the Knowledge System to improve both reasoning accuracy and interpretability. This simulates how humans conduct complex reasoning under the explicit instruction of math formulas. Experimental results show that FOMAS has stronger reasoning ability and a more interpretable reasoning process, which supports the necessity of introducing formula knowledge transparently.
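
To make the idea of annotating formula usage at each reasoning step concrete, the following is a minimal Python sketch. It is not the FOMAS model and does not reproduce the Math23K-F or MAWPS-F annotation format; the `Step` record, the `FORMULAS` table, the formula names, and the example problem are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Step:
    formula: str    # label of the formula applied at this step (hypothetical)
    inputs: tuple   # resolved numeric operands
    result: float   # value produced by applying the formula

# Hypothetical formula table; the names are illustrative only and are
# not taken from the annotated datasets.
FORMULAS = {
    "distance = speed * time": lambda speed, time: speed * time,
    "total = part + part": lambda a, b: a + b,
}

def solve(plan):
    """Apply formulas in order, recording which formula each step used.

    An operand written as ("step", i) refers to the result of step i.
    """
    steps = []
    for name, operands in plan:
        values = tuple(
            steps[o[1]].result if isinstance(o, tuple) else o
            for o in operands
        )
        steps.append(Step(name, values, FORMULAS[name](*values)))
    return steps

# Toy problem: a car drives at 60 km/h for 2 h, then at 40 km/h for 3 h.
# How far does it travel in total?
plan = [
    ("distance = speed * time", (60, 2)),
    ("distance = speed * time", (40, 3)),
    ("total = part + part", (("step", 0), ("step", 1))),
]

trace = solve(plan)
for s in trace:
    print(s.formula, s.inputs, "->", s.result)
```

Each entry in `trace` pairs an intermediate value with the formula that produced it, which is the kind of step-level transparency the abstract argues for.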

Effects of context, complexity, and clustering on evaluation for math formula retrieval

Formula Citation Graph Based Mathematical Information Retrieval

Formula Ranking Within an Article

Combining Text and Formula Queries in Math Information Retrieval: Evaluation of Query Results Merging Strategies

A Mathematics Retrieval System for Formulae in Layout Presentations

The Effectiveness of Graph Contrastive Learning on Mathematical Information Retrieval

The Tangent Search Engine: Improved Similarity Metrics and Scalability for Math Formula Search

A Mathematical Information Retrieval System Based On Rankboost

Mathematical Information Retrieval Trends and Techniques

ICST Math Retrieval System for NTCIR-11 Math-2 Task

Discovery and Recognition of Formula Concepts using Machine Learning

Preliminary Exploration of Formula Embedding for Mathematical Information Retrieval: can mathematical formulae be embedded like a natural language?

Which one is better: presentation-based or content-based math search?

MathIRs: Retrieval System for Scientific Documents

A Study of PHOC Spatial Region Configurations for Math Formula Retrieval

Wikimirs 3.0: A Hybrid Mir System Based On The Context, Structure And Importance Of Formulae In A Document

Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval

Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations

Guiding Mathematical Reasoning Via Mastering Commonsense Formula Knowledge

MathBERT: A Pre-Trained Model for Mathematical Formula Understanding

Research Status of Mathematical Formula Recognition