On the Utility of Domain Modeling Assistance with Large Language Models

Meriem Ben Chaaben, Lola Burgueño, Istvan David, Houari Sahraoui
2024-10-16
Abstract: Model-driven engineering (MDE) simplifies software development through abstraction, yet challenges such as time constraints, incomplete domain understanding, and adherence to syntactic constraints hinder the design process. This paper presents a study to evaluate the usefulness of a novel approach utilizing large language models (LLMs) and few-shot prompt learning to assist in domain modeling. The aim of this approach is to overcome the need for extensive training of AI-based completion models on scarce domain-specific datasets and to offer versatile support for various modeling activities, providing valuable recommendations to software modelers. To support this approach, we developed MAGDA, a user-friendly tool, through which we conduct a user study and assess the real-world applicability of our approach in the context of domain modeling, offering valuable insights into its usability and effectiveness.
Software Engineering, Artificial Intelligence, Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses the following problem: **how to use large language models (LLMs) and few-shot prompt learning to assist domain modeling, overcoming the challenges that existing methods face with scarce training data, incomplete domain understanding, and syntactic constraints**.

Specifically, the authors focus on domain modeling in software development, a complex and error-prone task. Domain modeling requires both domain knowledge and command of a modeling formalism, which makes it challenging for domain experts and software professionals alike. In addition, the domain itself is an open world, and deciding what belongs in the model is highly context-dependent. These challenges motivate the search for new tools and techniques that assist modelers in their tasks.

To this end, the paper proposes an approach based on large language models, with three aims:

1. **Reducing the need for large domain-specific datasets**: Traditional approaches usually require a large amount of domain-specific data to train AI models; this approach reduces that need through few-shot prompt learning.
2. **Providing flexible support**: The approach can support various modeling activities and provide suggestions to software modelers.
3. **Evaluating effectiveness in practice**: To assess the approach in real-world use, the authors developed MAGDA, a user-friendly tool, and evaluated its applicability and effectiveness in domain modeling through a user study.
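To make the idea of few-shot prompting concrete, the following is a minimal sketch of how a prompt might be assembled from a handful of example domain models plus the modeler's partial model. It is not MAGDA's actual prompt format, which this summary does not describe: the `Example` class, the `build_few_shot_prompt` function, the textual model notation, and the sample domains are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hypothetical few-shot example: a domain name and a textual model."""
    description: str
    model: str


def build_few_shot_prompt(examples, partial_model, task="Suggest additional classes"):
    """Assemble a plain-text few-shot prompt (illustrative format, not MAGDA's)."""
    parts = [f"Task: {task} for the partial domain model below."]
    # Each example shows the LLM the expected input/output shape.
    for i, ex in enumerate(examples, start=1):
        parts.append(f"Example {i}\nDomain: {ex.description}\nModel: {ex.model}")
    # The partial model comes last, so the LLM continues after "Suggestions:".
    parts.append(f"Partial model: {partial_model}\nSuggestions:")
    return "\n\n".join(parts)


# Assumed sample data; a real tool would draw examples from existing models.
examples = [
    Example("Library", "Book(title, isbn); Member(name); Loan(dueDate)"),
    Example("Hospital", "Patient(name); Doctor(specialty); Appointment(date)"),
]
prompt = build_few_shot_prompt(examples, "Course(title); Student(name)")
print(prompt)
```

The resulting string would be sent to an LLM of choice. Because the examples are supplied in the prompt itself, no model training on a domain-specific dataset is involved, which is the property the first aim above relies on.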
In summary, the paper's main objective is to examine the utility of large language models in domain modeling, in particular their effect on productivity, contribution, and creativity, and to answer the following research questions empirically:

- **Objective utility**:
  - RQ1 (Productivity): What is the impact of modeling assistance on the time required to complete the domain model?
  - RQ2 (Contribution): How much do the suggestions contribute to the final model?
  - RQ3 (Creativity): Does the assistance reduce the diversity of models produced by different modelers in the same domain?
- **Perceived utility**:
  - RQ4: Which type of assistance do participants prefer when completing the task?
  - RQ5: How do modelers perceive the impact of assistance on the modeling experience?

By answering these questions, the paper aims to provide a more efficient and practical assistance tool for domain modeling.