Peano: Learning Formal Mathematical Reasoning

Gabriel Poesia, Noah D. Goodman
DOI: https://doi.org/10.1098/rsta.2022.0044
2022-11-29
Abstract: General mathematical reasoning is computationally undecidable, but humans routinely solve new problems. Moreover, discoveries developed over centuries are taught to subsequent generations quickly. What structure enables this, and how might that inform automated mathematical reasoning? We posit that central to both puzzles is the structure of procedural abstractions underlying mathematics. We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform. To define a computational foundation, we introduce Peano, a theorem-proving environment where the set of valid actions at any point is finite. We use Peano to formalize introductory algebra problems and axioms, obtaining well-defined search problems. We observe existing reinforcement learning methods for symbolic reasoning to be insufficient to solve harder problems. Adding the ability to induce reusable abstractions ("tactics") from its own solutions allows an agent to make steady progress, solving all problems. Furthermore, these abstractions induce an order to the problems, seen at random during training. The recovered order has significant agreement with the expert-designed Khan Academy curriculum, and second-generation agents trained on the recovered curriculum learn significantly faster. These results illustrate the synergistic role of abstractions and curricula in the cultural transmission of mathematics.
Artificial Intelligence
What problem does this paper attempt to address?
The core question this paper addresses is: **How can a computer learn to solve new mathematical problems the way humans do, particularly in formal mathematical reasoning?** Specifically, the authors explore two issues:

1. **The computational undecidability of mathematical reasoning**: Although the provability of general mathematical propositions is computationally undecidable (per the work of Church and Turing), humans routinely solve new mathematical problems. Moreover, each new generation of mathematicians masters, in a relatively short time, knowledge that took their predecessors centuries to develop. What structure lies behind this ability?
2. **The construction of an automated mathematical reasoning system**: How can a system be designed so that a computer solves elementary algebra problems with minimal prior knowledge? In particular, how can reinforcement learning and induced abstractions ("tactics") improve the system so that it gradually solves more complex problems?

To address these questions, the authors propose the following:

- **Introducing the Peano language**: A theorem-proving language based on dependent types, designed so that the set of valid actions at any point is finite and every valid step can be enumerated during search.
- **Formalizing Khan Academy's algebra courses**: Five sections of Khan Academy's beginning algebra course are formalized as problems and axioms in Peano, yielding a series of well-defined search problems.
- **Applying reinforcement learning and tactic induction**: Agents are trained with reinforcement learning to solve the formalized problems. Inducing higher-level abstract actions ("tactics") from the agents' own solutions helps them gradually solve harder problems.
- **Reconstructing the curriculum**: The tactics learned by the agents and the dependencies among them are analyzed to recover a problem ordering similar to Khan Academy's curriculum, and the effect of this recovered curriculum on subsequent learning is then tested.

Through these methods, the authors demonstrate **the synergy between abstractions (tactics) and curricula in the transmission of mathematical knowledge**, and offer a new perspective on how humans pass on mathematical knowledge efficiently.

### Summary

The main contributions of this paper are:

- Proposing Peano, a theorem-proving environment with a finite action space.
- Formalizing Khan Academy's algebra courses as problems in Peano and showing that these problems are challenging for existing reinforcement learning agents.
- Enabling agents to make steady progress and eventually solve all problems through tactic induction.
- Observing that the tactics learned by agents can be used to recover an ordering that agrees significantly with Khan Academy's curriculum, and that second-generation agents trained on the recovered curriculum learn faster.

These results illustrate **the synergistic role of abstractions and curricula in the cultural transmission of mathematics**.
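
To make the idea of inducing tactics from an agent's own solutions more concrete, here is a toy Python sketch. It is not the paper's algorithm, which this summary does not describe. As a simplifying assumption, a "solution" is a list of action names and a "tactic" is any contiguous run of actions that recurs across solutions. The action names (`sub_both_sides`, `eval`, and so on) and both function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def induce_tactics(solutions, min_len=2, max_len=3, min_count=2):
    """Return recurring contiguous action sequences as candidate tactics.

    Assumption: a tactic is approximated by a frequent n-gram of actions.
    Each pattern is counted at most once per solution.
    """
    counts = Counter()
    for steps in solutions:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(steps) - n + 1):
                seen.add(tuple(steps[i:i + n]))
        counts.update(seen)
    # Most frequent patterns first; drop those seen in too few solutions.
    return [p for p, c in counts.most_common() if c >= min_count]


def rewrite_with_tactic(steps, tactic):
    """Collapse each occurrence of the tactic into one abstract action."""
    name = "tactic:" + "+".join(tactic)
    out, i, n = [], 0, len(tactic)
    while i < len(steps):
        if tuple(steps[i:i + n]) == tactic:
            out.append(name)
            i += n
        else:
            out.append(steps[i])
            i += 1
    return out


# Hypothetical solution traces for three linear-equation problems.
solutions = [
    ["sub_both_sides", "eval", "div_both_sides", "eval"],
    ["add_both_sides", "eval", "div_both_sides", "eval"],
    ["div_both_sides", "eval"],
]

tactics = induce_tactics(solutions)
shortened = rewrite_with_tactic(solutions[0], tactics[0])
```

In this sketch, the most frequent pattern becomes a single reusable action, so a four-step solution shrinks to three steps. That shortening of the search depth is the intuition behind why induced abstractions could help an agent reach harder problems.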