The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review

Daniel Schwabe, Katinka Becker, Martin Seyferth, Andreas Klaß, Tobias Schäffter
2024-02-21
Abstract: The adoption of machine learning (ML) and, more specifically, deep learning (DL) applications into all major areas of our lives is underway. The development of trustworthy AI is especially important in medicine due to the large implications for patients' lives. While trustworthiness concerns various aspects including ethical, technical and privacy requirements, we focus on the importance of data quality (training/test) in DL. Since data quality dictates the behaviour of ML products, evaluating data quality will play a key part in the regulatory approval of medical AI products. We perform a systematic review following PRISMA guidelines using the databases PubMed and ACM Digital Library. We identify 2362 studies, out of which 62 records fulfil our eligibility criteria. From this literature, we synthesise the existing knowledge on data quality frameworks and combine it with the perspective of ML applications in medicine. As a result, we propose the METRIC-framework, a specialised data quality framework for medical training data comprising 15 awareness dimensions, along which developers of medical ML applications should investigate a dataset. This knowledge helps to reduce biases as a major source of unfairness, increase robustness, facilitate interpretability and thus lays the foundation for trustworthy AI in medicine. Incorporating such systematic assessment of medical datasets into regulatory approval processes has the potential to accelerate the approval of ML products and builds the basis for new standards.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the lack of systematic data quality assessment for medical AI, in particular for the training and test data on which trustworthy AI depends. It focuses on three points:

1. **Importance of data quality**: Data quality largely determines the behavior of machine learning (ML) and deep learning (DL) applications, so high-quality training data is the foundation for trustworthy medical AI products.
2. **Shortcomings of existing frameworks**: Existing data quality frameworks cover general data quality dimensions but lack assessment methods tailored to medical data.
3. **Regulatory requirements**: Evaluating data quality is expected to play a key part in the regulatory approval of medical AI products, and existing assessment methods do not yet meet this need.

To address these issues, the paper conducts a systematic review following PRISMA guidelines (2362 studies identified, 62 records eligible) and proposes the METRIC-framework, a data quality framework specifically for medical training data. The framework comprises 15 awareness dimensions along which a dataset should be investigated. It is intended to help developers, regulatory bodies, and certification organizations assess medical datasets systematically, and thereby improve the trustworthiness and safety of medical AI products.