Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach

Vittoriano Muttillo, Claudio Di Sipio, Riccardo Rubei, Luca Berardinelli, MohammadHadi Dehghani

2024-08-26

Abstract: Producing accurate software models is crucial in model-driven software engineering (MDE). However, modeling complex systems is an error-prone task that requires deep application domain knowledge. In the past decade, several automated techniques have been proposed to support academic and industrial practitioners by providing relevant modeling operations. Nevertheless, those techniques require a huge amount of training data, which may be unavailable due to several factors, e.g., privacy issues. The advent of large language models (LLMs) can support the generation of synthetic data, although state-of-the-art approaches do not yet support the generation of modeling operations. To fill the gap, we propose a conceptual framework that combines modeling event logs, intelligent modeling assistants, and the generation of modeling operations using LLMs. In particular, the architecture comprises modeling components that help the designer specify the system, record its operation within a graphical modeling environment, and automatically recommend relevant operations. In addition, we generate a completely new dataset of modeling events by relying on the most prominent LLMs currently available. As a proof of concept, we instantiate the proposed framework using a set of existing modeling tools employed in industrial use cases within different European projects. To assess the proposed methodology, we first evaluate the capability of the examined LLMs to generate realistic modeling operations by relying on well-founded distance metrics. Then, we evaluate the recommended operations by considering real-world industrial modeling artifacts. Our findings demonstrate that LLMs can generate modeling events, even though the overall accuracy is higher when considering human-based operations.

Software Engineering

What problem does this paper attempt to address?

The paper addresses a data-scarcity problem in model-driven software engineering (MDE). Producing accurate software models is crucial but error-prone, and it requires in-depth application domain knowledge. Automated techniques can support practitioners in academia and industry by recommending relevant modeling operations, but they need large amounts of training data, which are often difficult to obtain (for example, because of privacy issues). The paper therefore proposes a conceptual framework that evaluates the ability of large language models (LLMs) to generate modeling operations, collects these operations as traces, and uses them to feed intelligent modeling assistants (IMAs). The framework fills a gap in existing approaches by combining modeling event logs, IMAs, and LLM-based generation of modeling operations. The paper also generates a new dataset of modeling events using the most prominent LLMs currently available, and, as a proof of concept, instantiates the framework with existing modeling tools used in different European projects. Its main goal is to explore how LLMs can generate synthetic traces of modeling operations, thereby supporting the training and optimization of IMAs when traces produced by real users are lacking.
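The two steps described above, prompting an LLM with example traces and then comparing a synthetic trace against a human one, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ModelingEvent` fields, the `ADD`/`SET` operation vocabulary, the prompt wording, and the choice of Levenshtein edit distance as the metric (the abstract only says "well-founded distance metrics" without naming them). No real LLM is called; the sketch only builds the prompt text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelingEvent:
    """One recorded modeling operation (hypothetical schema, not the paper's)."""
    operation: str      # e.g. "ADD", "SET", "REMOVE" (assumed vocabulary)
    element_type: str   # e.g. "Class", "Attribute"
    element_name: str

    def render(self) -> str:
        return f"{self.operation} {self.element_type} {self.element_name}"


def build_few_shot_prompt(example_traces, partial_trace):
    """Assemble an in-context prompt: example traces first, then the trace to continue."""
    parts = ["Continue the modeling trace with the next operations."]
    for i, trace in enumerate(example_traces, start=1):
        parts.append(f"Example trace {i}:")
        parts.extend(event.render() for event in trace)
    parts.append("Trace to continue:")
    parts.extend(event.render() for event in partial_trace)
    return "\n".join(parts)


def edit_distance(a, b):
    """Levenshtein distance between two event sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_similarity(a, b):
    """Map the edit distance to [0, 1], where 1.0 means identical traces."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Illustrative traces only; these are not data from the paper.
human = [
    ModelingEvent("ADD", "Class", "Sensor"),
    ModelingEvent("ADD", "Attribute", "id"),
    ModelingEvent("SET", "Attribute", "id"),
]
synthetic = [
    ModelingEvent("ADD", "Class", "Sensor"),
    ModelingEvent("ADD", "Attribute", "name"),
    ModelingEvent("SET", "Attribute", "name"),
]

prompt = build_few_shot_prompt([human], synthetic[:1])
distance = edit_distance(human, synthetic)
similarity = normalized_similarity(human, synthetic)
```

Comparing whole events for equality is the simplest possible choice. A metric that scores partial matches (same operation, different element name) would likely be more informative for real traces, but that refinement is beyond this sketch.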

Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Interpretive Large Language Model-Based Materials Design

Towards Generating Executable Metamorphic Relations Using Large Language Models

Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming

Process Modeling With Large Language Models

Generative large language models in engineering design: opportunities and challenges

ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling

Control Industrial Automation System with Large Language Models

Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration

MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification

Large Language Models for Scholarly Ontology Generation: An Extensive Analysis in the Engineering Field

Large Language Models as Molecular Design Engines

On the use of Large Language Models in Model-Driven Engineering

Large Language Models as Planning Domain Generators

Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making

On the Utility of Domain Modeling Assistance with Large Language Models

LLMs for science: Usage for code generation and data analysis

Solution-oriented Agent-based Models Generation with Verifier-assisted Iterative In-context Learning

Towards Practical Tool Usage for Continually Learning LLMs

Evaluating In-Context Learning of Libraries for Code Generation