Comparison of AI-integrated pathways with human-AI interaction for population mammographic screening

Helen ML Frazer, Carlos A Pena-Solorzano, Chun Fung Kwok, Michael Elliott, Yuanhong Chen, Chong Wang, the BRAIx team, Jocelyn Lippey, John Hopper, Peter Brotchie, Gustavo Carneiro, Davis J McCarthy
DOI: https://doi.org/10.1101/2022.11.23.22282646
2024-05-14
Abstract: Artificial intelligence (AI) holds promise for improving breast cancer screening, but many challenges remain in implementing AI tools in clinical screening services. AI readers compare favourably against individual human radiologists in detecting breast cancer in population screening programs. However, single AI or human readers cannot perform at the level of multi-reader systems such as those used in Australia, Sweden, the UK, and other countries. The implementation of AI readers in mammographic screening programs therefore demands integration of AI readers in multi-reader systems featuring collaboration between humans and AI. Successful integration of AI readers demands a better understanding of possible models of human-AI collaboration and exploration of the range of possible outcomes engendered by the effects on human readers of interacting with AI readers. Here, we used a large, high-quality retrospective mammography dataset from Victoria, Australia to conduct detailed simulations of five plausible AI-integrated screening pathways. We compared the performance of these AI-integrated pathways against the baseline standard-of-care "two reader plus third arbitration" system used in Australia. We examined the influence of positive, neutral, and negative human-AI interaction effects of varying strength to explore possibilities for upside, automation bias, and downside risk of human-AI collaboration. Replacing the second reader or allowing the AI reader to make high confidence decisions can improve upon the standard of care screening outcomes by 1.9-2.5% in sensitivity and up to 0.6% in specificity (with 4.6-10.9% reduction in the number of assessments and 48-80.7% reduction in the number of reads). Automation bias degrades performance in multi-reader settings but improves it for single-readers. Using an AI reader to triage between single and multi-reader pathways can improve performance given positive human-AI interaction.
This study provides insight into feasible approaches for implementing human-AI collaboration in population mammographic screening, incorporating human-AI interaction effects. Our study provides evidence to support the urgent assessment of AI-integrated screening pathways with prospective studies to validate real-world performance and open routes to clinical adoption.
What problem does this paper attempt to address?
This paper addresses how AI tools can be implemented effectively in clinical breast cancer screening services. It notes that although AI readers compare favourably with individual radiologists in detecting breast cancer, a single reader, whether AI or human, does not reach the performance of the multi-reader systems used in countries such as Australia, Sweden, and the United Kingdom. AI readers therefore need to be integrated into multi-reader systems through human-AI collaboration. Using a large retrospective mammography dataset from Victoria, Australia, the authors simulated five plausible AI-integrated screening pathways and compared them with Australia's standard-of-care "two reader plus third arbitration" system. The simulations covered positive, neutral, and negative human-AI interaction effects of varying strength, in order to explore the potential upside, automation bias, and downside risk of collaboration. Replacing the second reader with AI, or allowing the AI reader to make high-confidence decisions, improved on the standard of care by 1.9-2.5% in sensitivity and up to 0.6% in specificity, while reducing the number of assessments by 4.6-10.9% and the number of reads by 48-80.7%. Automation bias degraded performance in multi-reader settings but improved it for single readers. Using an AI reader to triage cases between single-reader and multi-reader pathways improved performance when human-AI interaction was positive. The paper concludes that AI-integrated screening pathways should be assessed urgently in prospective studies to validate real-world performance and open routes to clinical adoption.
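
To make the pathway comparison concrete, the following is a minimal sketch of how a "two reader plus third arbitration" decision and an AI-as-second-reader variant could be simulated on retrospective reads. It is not the authors' simulation code. The function names, the case dictionary keys (`r1`, `r2`, `r3`, `ai`, `cancer`), and the four toy cases are all hypothetical, and the sketch assumes that arbitration stays with a human third reader when the AI replaces the second reader.

```python
from typing import Dict, List, Tuple

Case = Dict[str, bool]


def arbitrated_recall(first: bool, second: bool, third: bool) -> Tuple[bool, bool]:
    """Return (recall decision, whether arbitration was needed).

    If the first two readers agree, their shared decision stands;
    otherwise the third reader decides.
    """
    if first == second:
        return first, False
    return third, True


def simulate(cases: List[Case], second_key: str) -> Tuple[List[bool], int]:
    """Run the arbitration pathway with `second_key` as the second reader.

    second_key="r2" gives the all-human baseline; second_key="ai"
    swaps in the AI reader (an assumed reading of "replacing the
    second reader"). Returns the recall decisions and the number of
    human reads consumed.
    """
    decisions: List[bool] = []
    human_reads = 0
    for case in cases:
        recall, arbitrated = arbitrated_recall(case["r1"], case[second_key], case["r3"])
        decisions.append(recall)
        human_reads += 1                        # first reader is always human
        human_reads += 0 if second_key == "ai" else 1
        human_reads += 1 if arbitrated else 0   # human third reader
    return decisions, human_reads


def sensitivity_specificity(decisions: List[bool], cancers: List[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / positives, specificity = TN / negatives."""
    tp = sum(1 for d, y in zip(decisions, cancers) if d and y)
    tn = sum(1 for d, y in zip(decisions, cancers) if not d and not y)
    positives = sum(1 for y in cancers if y)
    negatives = len(cancers) - positives
    return tp / positives, tn / negatives


# Invented toy data for illustration only; not drawn from the study.
TOY_CASES: List[Case] = [
    {"r1": True,  "r2": True,  "r3": True,  "ai": True,  "cancer": True},
    {"r1": False, "r2": True,  "r3": False, "ai": False, "cancer": False},
    {"r1": True,  "r2": False, "r3": True,  "ai": True,  "cancer": True},
    {"r1": False, "r2": False, "r3": False, "ai": False, "cancer": False},
]
```

Calling `simulate(TOY_CASES, "r2")` and `simulate(TOY_CASES, "ai")` and passing each set of decisions to `sensitivity_specificity` gives a side-by-side comparison of accuracy and human workload. A fuller simulation would also need a model of how human decisions shift after seeing AI output, which is the interaction effect the paper varies.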