Abstract: How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model. Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof search. Our agent targets generating hard but provable conjectures - a moving target, since its own theorem proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs.

What problem does this paper attempt to address?

The paper addresses how an AI agent can carry out mathematical reasoning autonomously: given only a set of axioms, the agent should generate challenging but provable conjectures and find proofs for them. Specifically, the goals of the paper are:

1. **Generate valid mathematical conjectures**: Without prior knowledge, start from the axioms of a domain stated in dependent type theory and train an agent to generate valid mathematical propositions (conjectures). The difficulty is ensuring that every generated conjecture is well-formed, even when the model starts from random initialization.
2. **Improve theorem-proving ability**: The agent should not only generate conjectures but also attempt to prove them. As training progresses, its theorem-proving ability should improve, allowing it to solve more complex and challenging problems.
3. **Use intrinsic motivation for self-improvement**: An intrinsic motivation mechanism lets the agent keep exploring and learning without external rewards. The agent adjusts the difficulty of its conjectures to its own progress, aiming for conjectures that are hard but still provable.
4. **Address the sparse-reward problem**: In theorem proving, successful proofs are relatively scarce. The paper therefore proposes "hindsight relabeling" on proof search trees, which reinterprets failed proof attempts as successes for the goals they did reach, producing more training data and speeding up learning.

### Main contributions of the paper

- **Proposed the Minimo framework**: Combines ideas from language models and reinforcement learning in an intrinsically motivated learning loop that performs automated mathematical reasoning starting from the axioms.
- **Defined a new conjecture-generation method**: Uses constrained decoding and type-directed synthesis to ensure that generated conjectures are valid and that their difficulty can be adjusted.
- **Introduced the hindsight relabeling technique**: Significantly improves sample efficiency by extracting useful training data even from failed proof searches.
- **Verified the effectiveness of the system**: Experiments in three axiomatic domains (propositional logic, arithmetic, and group theory) demonstrate the agent's self-improvement in generating valid conjectures and in proving theorems.

Through these methods, the paper aims to narrow the gap between existing mathematical reasoning systems and human mathematicians and to advance AI in mathematics.
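To make the hindsight relabeling idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Node` class, the `relabel` function, the rule names, and the arithmetic facts are all hypothetical, chosen only for illustration. The sketch assumes a proof search tree in which each node records the statement established there and the step that produced it. Even when the search never reaches its original goal, every path from the root can be read, in hindsight, as a proof of the statement at its endpoint.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One state in a proof search tree (hypothetical structure)."""
    fact: str                       # statement established at this state
    action: Optional[str] = None    # step that produced it; None at the root
    children: List["Node"] = field(default_factory=list)


def relabel(node: Node, path: Tuple[str, ...] = ()) -> List[Tuple[str, List[str]]]:
    """Collect (goal, proof_steps) pairs, treating every derived fact as a goal."""
    examples = []
    if node.action is not None:
        path = path + (node.action,)
        # In hindsight, the steps leading here are a proof of node.fact.
        examples.append((node.fact, list(path)))
    for child in node.children:
        examples.extend(relabel(child, path))
    return examples


# A search for a goal that was never reached (all names below are made up).
failed_goal = "s(0) + 0 = 0 + s(0)"
root = Node("axioms", children=[
    Node("0 + 0 = 0", "apply +_z", children=[
        Node("s(0 + 0) = s(0)", "apply congruence"),
    ]),
    Node("0 + s(0) = s(0 + 0)", "apply +_s"),
])

examples = relabel(root)
for goal, steps in examples:
    print(goal, "<-", steps)
```

Although the search for `failed_goal` produced no proof, the tree still yields three (goal, steps) pairs that could serve as training data for a prover. How the relabeled data is used to train the conjecturer as well is not covered by this sketch.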

Learning Formal Mathematics From Intrinsic Motivation

Peano: Learning Formal Mathematical Reasoning

LeanAgent: Lifelong Learning for Formal Theorem Proving

Advancing mathematics by guiding human intuition with AI

Fail better: What formalized math can teach us about learning

Is mathematics a game?

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From A Psychological Perspective

Formal Mathematics Statement Curriculum Learning

Towards Intuitive Reasoning in Axiomatic Geometry

Towards a Mathematics Formalisation Assistant using Large Language Models

LEMMA: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions

Introduction to Mathematical Language Processing: Informal Proofs, Word Problems, and Supporting Tasks

Learning $\textit{Ex Nihilo}$

Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions

Hard Proofs and Good Reasons

A memory theoretic approach for investigating the roles of language and intuition in mathematical thinking activities

A Reinforcement Learning Environment for Mathematical Reasoning via Program Synthesis

Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4

Motif: Intrinsic Motivation from Artificial Intelligence Feedback

Automating the Generation of High School Geometry Proofs using Prolog in an Educational Context

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology