FormalGeo: The First Step Toward Human-like IMO-level Geometric Automated Reasoning
Xiaokai Zhang, Na Zhu, Yuan He, Jia Zhang, Qing Huang, Xiaoxiao Jin, Youguang Guo, Chenyang Mao, Zhe Zhu, Deng-Feng Yue, Fei Zhu, Yang Li, Yifan Wang, Yiwen Huang, Runan Wang, Cheng Qian, Zhenbing Zeng, Shaorong Xie, Xiapu Luo, Tuo Leng
DOI: https://doi.org/10.48550/arxiv.2310.18021
2023-01-01
Abstract: This is the first paper in a series of works we have completed over the past three years. In this paper, we construct a consistent formal plane geometry system, which serves as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning. Within this formal framework, we have been able to integrate modern AI models with our formal system. AI can now provide deductive reasoning solutions to IMO-level plane geometry problems, just as it handles natural languages, and these proofs are readable, traceable, and verifiable. We propose the geometry formalization theory (GFT) to guide the development of the geometry formal system. Based on the GFT, we have established FormalGeo, which consists of 88 geometric predicates and 196 theorems. It can represent, validate, and solve IMO-level geometry problems. We have also built FGPS (formal geometry problem solver) in Python. It serves both as an interactive assistant for verifying problem-solving processes and as an automated problem solver. We have annotated the formalgeo7k and formalgeo-imo datasets. The former contains 6,981 geometry problems (expanded to 133,818 through data augmentation), while the latter contains 18 challenging IMO-level geometry problems (expanded to 2,627 and still growing). All annotated problems include detailed formal language descriptions and solutions. The implementation of the formal system and our experiments validate the correctness and utility of the GFT. The backward depth-first search method yields a problem-solving failure rate of only 2.42%, and deep learning techniques can be incorporated to reduce it further. The source code of FGPS and the datasets are available at https://github.com/BitSecret/FGPS.