Training LLMs for Generating IEC 61131-3 Structured Text with Online Feedback

Aaron Haag, Bertram Fuchs, Altay Kacan, Oliver Lohse
2024-10-30
Abstract: The advent of large language models (LLMs), such as GPT-4, has enabled significant advancements in generating code across various domains. However, these models face unique challenges when generating IEC 61131-3 Structured Text (ST) code due to limited data in public training datasets and the complexity of ST language syntax. This paper proposes a novel approach to training LLMs that emphasizes improving the quality of learning data through an online process involving compiler feedback and evaluation from a secondary LLM. In this framework, the primary LLM generates new training samples, which are subsequently evaluated by a compiler for syntactical correctness and by a specialized LLM that excels at assessing semantic accuracy, though it is not optimized for code generation itself. Through iterative refinement of the training data, this approach results in marked improvements for the trained LLM, leading to higher compilation success rates and better semantic precision. As a result, the framework proves highly suitable for industrial automation applications and outperforms state-of-the-art models.
Software Engineering
What problem does this paper attempt to address?
This paper addresses how to improve the quality of IEC 61131-3 Structured Text (ST) code generated by large language models (LLMs). Existing LLMs face three challenges when generating ST code:

1. **Data Scarcity**: Public training datasets contain too little ST code.
2. **Grammar Complexity**: The grammar of the ST language is relatively complex, and existing LLMs struggle to master its rules.
3. **Semantic Accuracy**: Generated code must be not only grammatically correct but also logically correct, so that it implements the intended function.

To solve these problems, the paper proposes a training framework that improves the quality of the LLM's training data through compiler feedback and evaluation by a specialized LLM. The method refines the generated ST code iteratively, raising both the compilation success rate and semantic accuracy, and targets industrial automation applications.

### Specific Problem Description

- **Data Scarcity Problem**: Because PLC programming is a specialized field, existing public datasets are too small to train LLMs effectively, which results in low-quality generated code.
- **Grammar and Semantic Accuracy Problems**: ST code must be grammatically correct and must also behave as intended, which places higher demands on LLMs.
- **Limitations of Existing Methods**: Traditional methods based on human feedback are costly and inefficient and cannot meet the needs of large-scale code generation.

### Solution

The solution proposed in the paper has four parts:

1. **Compiler Feedback**: Use a compiler to check the generated code for grammatical correctness.
2. **Semantic Evaluation**: Introduce a specialized LLM as a semantic expert to evaluate the logical correctness of the generated code.
3. **Iterative Optimization**: Refine the training data over successive iterations to gradually improve the quality of the generated code.
4. **Online Feedback Mechanism**: Combine Direct Preference Optimization (DPO) with a real-time feedback mechanism that adjusts the model parameters dynamically to improve performance.

### Goal

The ultimate goal is a robust and scalable solution that overcomes the limitations of traditional human feedback and static datasets, enabling LLMs to generate high-quality ST code that complies with the industry standard and thereby better serves the industrial automation field. The paper reports significant improvements with this method, especially in compilation success rate and semantic accuracy, surpassing existing state-of-the-art models.
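
The feedback loop described under Solution can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` type, the `compile_ok` and `judge` callables (stand-ins for an ST compiler and the evaluator LLM), the 0-to-1 score range, and the rule of pairing the best candidate with the worst are all assumptions made for illustration. The `dpo_loss` function follows the standard DPO objective for a single preference pair, with `beta` as an assumed default.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    code: str
    compiles: bool          # verdict from the (hypothetical) compiler check
    semantic_score: float   # assumed 0..1 rating from the evaluator LLM


def score_candidates(
    prompt: str,
    codes: List[str],
    compile_ok: Callable[[str], bool],
    judge: Callable[[str, str], float],
) -> List[Candidate]:
    """Attach compiler and evaluator feedback to each generated sample."""
    return [Candidate(c, compile_ok(c), judge(prompt, c)) for c in codes]


def build_preference_pair(candidates: List[Candidate]) -> Optional[Tuple[str, str]]:
    """Return (chosen, rejected), or None if the feedback cannot separate them.

    Assumed ranking rule: compiling code beats non-compiling code, and the
    semantic score breaks ties.
    """
    def key(c: Candidate) -> Tuple[bool, float]:
        return (c.compiles, c.semantic_score)

    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=key, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if key(best) == key(worst):
        return None
    return best.code, worst.code


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one pair: -log(sigmoid(margin)), computed stably."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(m)) == softplus(-m) == max(-m, 0) + log1p(exp(-|m|))
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

In an online setting, one would generate several candidates per prompt with the current model, call `score_candidates` and `build_preference_pair`, and feed the resulting pairs into a DPO update before sampling again. The log-probabilities passed to `dpo_loss` would come from the trained model and a frozen reference model.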