Model driven engineering for machine learning components: A systematic literature review
Hira Naveed, Chetan Arora, Hourieh Khalajzadeh, John Grundy, Omar Haggag
DOI: https://doi.org/10.1016/j.infsof.2024.107423
IF: 3.9
2024-02-17
Information and Software Technology
Abstract:
Context: Machine Learning (ML) has become widely adopted as a component in many modern software applications. Due to the large volumes of data available, organizations want to increasingly leverage their data to extract meaningful insights and enhance business profitability. ML components enable predictive capabilities, anomaly detection, recommendation, accurate image and text processing, and informed decision-making. However, developing systems with ML components is not trivial; it requires time, effort, knowledge, and expertise in ML, data processing, and software engineering. There have been several studies on the use of model-driven engineering (MDE) techniques to address these challenges when developing traditional software and cyber–physical systems. Recently, there has been a growing interest in applying MDE for systems with ML components.
Objective: The goal of this study is to further explore the promising intersection of MDE with ML (MDE4ML) through a systematic literature review (SLR). Through this SLR, we wanted to analyze existing studies, including their motivations, MDE solutions, evaluation techniques, key benefits and limitations.
Method: Our SLR is conducted following the well-established guidelines by Kitchenham. We started by devising a protocol and systematically searching seven databases, which resulted in 3934 papers. After iterative filtering, we selected 46 highly relevant primary studies for data extraction, synthesis, and reporting.
Results: We analyzed selected studies with respect to several areas of interest and identified the following: (1) the key motivations behind using MDE4ML; (2) a variety of MDE solutions applied, such as modeling languages, model transformations, tool support, targeted ML aspects, contributions and more; (3) the evaluation techniques and metrics used; and (4) the limitations and directions for future work. We also discuss the gaps in existing literature and provide recommendations for future research.
Conclusion: This SLR highlights current trends, gaps and future research directions in the field of MDE4ML, benefiting both researchers and practitioners.
computer science, information systems, software engineering