Improving the Diproche CNL through Autoformalization via Large Language Models

Merlin Carl
DOI: https://doi.org/10.4204/EPTCS.400.4
2024-04-10
Abstract: The Diproche system is an automated proof checker for texts written in a controlled fragment of German, designed for didactical applications in classes introducing students to proofs for the first time. The first version of the system used a controlled natural language for which a Prolog formalization routine was written. In this paper, we explore the possibility of prompting large language models for autoformalization in the context of Diproche, with encouraging first results.
Logic in Computer Science
What problem does this paper attempt to address?
This paper discusses how to improve Diproche, an automated proof checker for texts written in a controlled fragment of German and intended for teaching, by using large language models for autoformalization. The current system relies on classical computational-linguistics methods (a formalization routine written in Prolog), which the author argues limit expressive power and user experience. The paper proposes prompting pre-trained language models such as DaVinci-3 and GPT-4 to perform the autoformalization, with the aim of making the system more flexible and usable and of simplifying multi-language support. Preliminary experiments are encouraging: the approach can handle inputs in different languages and tolerates grammar and spelling errors well. However, the paper also notes that the models still struggle with certain sentence structures and logical connections, so further improvement is needed.
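
To make the idea of prompting a language model for autoformalization more concrete, here is a minimal Python sketch. It is not the paper's prompt or Diproche's internal format: the few-shot examples, the target notation (`assume(...)`, `claim(...)`), and the names `build_prompt`, `autoformalize`, and `fake_model` are all hypothetical and chosen only for illustration. The model call is passed in as a plain function so the sketch runs without any external service.

```python
from typing import Callable, List, Tuple

# Hypothetical few-shot pairs: (proof sentence, target formal representation).
# The target notation is invented for this sketch; it is NOT Diproche's format.
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("Sei n eine gerade Zahl.", "assume(even(n))"),
    ("Then n+1 is odd.", "claim(odd(n+1))"),
]


def build_prompt(
    sentence: str,
    examples: List[Tuple[str, str]] = FEW_SHOT_EXAMPLES,
) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    lines = ["Translate each proof sentence into a formal representation.", ""]
    for source, target in examples:
        lines.append(f"Sentence: {source}")
        lines.append(f"Formal: {target}")
        lines.append("")
    lines.append(f"Sentence: {sentence}")
    lines.append("Formal:")
    return "\n".join(lines)


def autoformalize(sentence: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to any text-completion function and tidy the reply."""
    return complete(build_prompt(sentence)).strip()


def fake_model(prompt: str) -> str:
    """Stand-in for a real language model call (assumption: returns raw text)."""
    return " claim(even(2*k))\n"


if __name__ == "__main__":
    print(build_prompt("Also ist 2k gerade."))
    print(autoformalize("Also ist 2k gerade.", fake_model))
```

In a real setting, `fake_model` would be replaced by a call to whichever model is being evaluated, and the returned string would still need to be validated before being handed to the proof checker, since the model's output is not guaranteed to be well-formed.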