Learning Formal Mathematics From Intrinsic Motivation

Gabriel Poesia, David Broman, Nick Haber, Noah D. Goodman
2024-06-30
Abstract: How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model. Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof search. Our agent targets generating hard but provable conjectures - a moving target, since its own theorem proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs.
Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how an AI agent can learn mathematical reasoning autonomously: given only the axioms of a domain, the agent must generate challenging but provable conjectures and find proofs for them. Specifically, the goals of the paper are:

1. **Generate valid mathematical conjectures**: Starting from a randomly initialized model and the axioms of a domain expressed in dependent type theory, train an agent to generate valid mathematical propositions (conjectures). Constrained decoding combined with type-directed synthesis guarantees that every sampled conjecture is well-formed by construction.
2. **Improve theorem-proving ability**: The agent must also attempt to prove the conjectures it generates. As training progresses, its theorem-proving ability should improve, letting it solve increasingly hard problems.
3. **Use intrinsic motivation for self-improvement**: The agent explores and learns without external rewards. It targets conjectures that are hard but provable, which is a moving target because its own proving ability improves as it trains.
4. **Address the sparse-reward problem**: Successful proofs are scarce in theorem proving. The paper therefore proposes hindsight relabeling on proof search trees, which reinterprets paths from failed proof searches as successful training examples, yielding more training data and faster learning.

### Main contributions of the paper

- **Proposed the Minimo framework**: Combines a language model with ideas from reinforcement learning (the same model serves as the policy and value function for proof search) in an intrinsically motivated learning loop that starts from the axioms alone.
- **Defined a new conjecture-generation method**: Uses constrained decoding and type-directed synthesis to ensure that generated conjectures are valid and that their difficulty can be adjusted.
- **Introduced hindsight relabeling**: Significantly improves sample efficiency in both conjecturing and proving by extracting useful training data even from failed proof searches.
- **Verified the effectiveness of the system**: Experiments in three axiomatic domains (propositional logic, arithmetic, and group theory) show that the agent can bootstrap from the axioms alone, improving both at generating true, challenging conjectures and at finding proofs.

Through these methods, the paper aims to narrow the gap between existing mathematical reasoning systems and human mathematicians and to advance AI for mathematics.
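
The hindsight relabeling idea mentioned above can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the paper's implementation: the `Node` class, the `hindsight_relabel` function, the string encoding of statements and steps, and the forward-reasoning setting (each node records a fact derived from the hypotheses) are all hypothetical choices made for illustration. The point it shows is that a search that never reaches its target still derives intermediate facts, and each of those can be treated as a goal whose proof is the path that reached it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One state in a toy proof search tree (hypothetical structure)."""
    statement: str                   # fact established at this node
    action: Optional[str]            # step that produced it (None at the root)
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def expand(self, action: str, statement: str) -> "Node":
        """Add a child reached by applying `action` at this node."""
        child = Node(statement, action, parent=self)
        self.children.append(child)
        return child


def path_to(node: Node) -> List[str]:
    """Return the sequence of steps leading from the root to `node`."""
    steps = []
    while node.parent is not None:
        steps.append(node.action)
        node = node.parent
    return list(reversed(steps))


def hindsight_relabel(root: Node, target: str) -> List[Dict]:
    """Turn every non-root node into a (goal, proof) training example.

    Assumption: each node's statement really follows from the steps on
    its path, so the path is a valid proof of that statement even when
    the search never reached `target`.
    """
    examples = []
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        if node.parent is None:
            continue  # the root holds only the starting hypotheses
        examples.append({
            "goal": node.statement,
            "proof": path_to(node),
            "was_target": node.statement == target,
        })
    return examples


# A failed search: the target "s" is never derived from the hypotheses.
root = Node("hypotheses: p, p -> q, q -> r", None)
q = root.expand("modus ponens on p, p -> q", "q")
r = q.expand("modus ponens on q, q -> r", "r")
pp = root.expand("and-intro on p, p", "p & p")

examples = hindsight_relabel(root, target="s")
for ex in examples:
    print(ex["goal"], "<=", ex["proof"])
```

In this toy run the search for `s` fails, yet three usable examples come out of it, one per derived fact. A real system would also need to decide which relabeled goals are worth training on; that choice is outside the scope of this sketch.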