Machine learning experiment management tools: a mixed-methods empirical study

Samuel Idowu, Osman Osman, Daniel Strüber, Thorsten Berger
DOI: https://doi.org/10.1007/s10664-024-10444-w
IF: 3.762
2024-05-30
Empirical Software Engineering
Abstract: Machine Learning (ML) experiment management tools support ML practitioners and software engineers when building intelligent software systems. By managing large numbers of ML experiments comprising many different ML assets, they facilitate not only engineering ML models and ML-enabled systems, but also managing their evolution, for instance, tracing system behavior to concrete experiments when the model performance drifts. However, while ML experiment management tools have become increasingly popular, little is known about their effectiveness in practice, as well as their actual benefits and challenges. We present a mixed-methods empirical study of experiment management tools and the support they provide to users. First, our survey of 81 ML practitioners sought to determine the benefits and challenges of ML experiment management and of the existing tool landscape. Second, a controlled experiment with 15 student developers investigated the effectiveness of ML experiment management tools. We learned that 70% of our survey respondents perform ML experiments using specialized tools, while out of those who do not use such tools, 52% are unaware of experiment management tools or of their benefits. The controlled experiment showed that experiment management tools offer valuable support to users to systematically track and retrieve ML assets. Using ML experiment management tools reduced error rates and increased completion rates. By presenting a user's perspective on experiment management tools, and the first controlled experiment in this area, we hope that our results foster the adoption of these tools in practice and direct tool builders and researchers to improve the tool landscape overall.
computer science, software engineering
What problem does this paper attempt to address?
This paper examines the effectiveness, benefits, and challenges of machine learning (ML) experiment management tools in practice. The authors conducted a mixed-methods empirical study consisting of a survey of 81 ML practitioners and a controlled experiment with 15 student developers. The survey found that 70% of respondents perform ML experiments using specialized tools, while among those who do not, 52% are unaware of experiment management tools or of their benefits. The controlled experiment showed that using ML experiment management tools reduced error rates, increased completion rates, and helped users systematically track and retrieve ML assets.
The study is motivated by the observation that, despite the growing popularity of ML experiment management tools, little is known about their effectiveness and concrete benefits in practice. These tools support the development of ML models and intelligent software systems by managing large numbers of experiments and the assets involved, including datasets, models, code, and parameters. Traditional version control systems are not fully suitable for ML development because they do not provide the level of abstraction needed to explore a project's experiment history. The paper emphasizes the role of experiment management tools in version tracking, traceability, auditability, reproducibility, and collaboration, which helps users compare experiment iterations and answer factual questions about the assets of ongoing or completed experiments. Through the survey and the experiment, the researchers sought to understand the challenges users face, the support the tools provide, and the benefits they deliver.
The goal of the paper is to promote the adoption of these tools in practice, provide insights for researchers and tool developers to improve the tools, and offer recommendations for educators to train software engineers in building ML-driven systems using these tools. In this way, they hope to drive the development of ML experiment management tools, making them more effective and increasing their application in the industry.
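To make the kind of asset tracking and retrieval described above more concrete, the following is a minimal in-memory sketch in Python. It is not taken from the paper and does not reproduce the API of any tool the study evaluated; the names `Run`, `ExperimentLog`, `log_run`, `best_run`, and `diff_params`, as well as all parameter and metric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Run:
    """One experiment run and the assets tied to it (hypothetical schema)."""
    run_id: int
    params: dict[str, Any]
    metrics: dict[str, float] = field(default_factory=dict)
    dataset_version: str = ""
    code_version: str = ""


class ExperimentLog:
    """In-memory stand-in for an experiment management tool (assumption)."""

    def __init__(self) -> None:
        self._runs: list[Run] = []

    def log_run(self, params, metrics, dataset_version, code_version) -> Run:
        # Record parameters, metrics, and asset versions together so that
        # a result can later be traced back to what produced it.
        run = Run(len(self._runs) + 1, dict(params), dict(metrics),
                  dataset_version, code_version)
        self._runs.append(run)
        return run

    def best_run(self, metric: str, higher_is_better: bool = True) -> Run:
        # Answer a factual question: which run scored best on this metric?
        candidates = [r for r in self._runs if metric in r.metrics]
        if not candidates:
            raise ValueError(f"no run recorded the metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])

    def diff_params(self, a: Run, b: Run) -> dict[str, tuple[Any, Any]]:
        # Compare two experiment iterations by the parameters that differ.
        keys = a.params.keys() | b.params.keys()
        return {k: (a.params.get(k), b.params.get(k))
                for k in sorted(keys)
                if a.params.get(k) != b.params.get(k)}


# Illustrative usage with made-up values.
log = ExperimentLog()
first = log.log_run({"lr": 0.1, "depth": 3}, {"accuracy": 0.81},
                    "data-v1", "abc123")
second = log.log_run({"lr": 0.01, "depth": 3}, {"accuracy": 0.86},
                     "data-v1", "def456")
best = log.best_run("accuracy")
changed = log.diff_params(first, second)
```

In this sketch, `best` is the second run and `changed` reports that only `lr` differs between the two runs. Real experiment management tools persist such records and usually add artifact storage, dashboards, and collaboration features, which this example leaves out.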