Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey

Antonio Pedro Santos Alves, Marcos Kalinowski, Görkem Giray, Daniel Mendez, Niklas Lavesson, Kelly Azevedo, Hugo Villamizar, Tatiana Escovedo, Helio Lopes, Stefan Biffl, Jürgen Musil, Michael Felderer, Stefan Wagner, Teresa Baldassarre, Tony Gorschek
DOI: https://doi.org/10.48550/arXiv.2310.06726
2023-10-10
Software Engineering
Abstract: Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the empirical evidence on how RE is applied in practice for ML-enabled systems is dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems. We gathered 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals, and qualitative analyses on the reported problems using open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format is interactive Notebooks, (iii) non-functional requirements focus mainly on data quality, model reliability, and model explainability, and (iv) the main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results provide a better understanding of the adopted practices and of the problems that exist in practical environments. We put forward the need to further adapt and disseminate RE-related practices for engineering ML-enabled systems.