Automation Bias in AI-Assisted Detection of Cerebral Aneurysms on Time-of-Flight MR-Angiography

Su Hwan Kim, Severin Schramm, Evamaria Olga Riedel, Lena Schmitzer, Enrike Rosenkranz, Olivia Kertels, Jannis Bodden, Karolin Paprottka, Dominik Sepp, Martin Renz, Jan Kirschke, Thomas Baum, Christian Maegerlein, Tobias Boeckh-Behrens, Claus Zimmer, Benedikt Wiestler, Dennis Martin Hedderich
DOI: https://doi.org/10.1101/2024.05.31.24308021
2024-06-03
Abstract:
Background: AI systems have the potential to support the detection of cerebral aneurysms. Yet the role of automation bias (the inclination of humans to rely excessively on automated decision-making systems) in AI-assisted cerebral aneurysm detection remains unclear.
Purpose: To determine how automation bias can affect radiologists with varying experience levels when reading time-of-flight magnetic resonance angiography (TOF-MRA) studies with the assistance of an AI system for cerebral aneurysm detection.
Methods: In this prospective experiment, nine radiologists with varying levels of experience evaluated twenty TOF-MRA exams for the presence of anterior circulation aneurysms, rating each arterial segment on a 4-point Likert scale and providing follow-up recommendations. Every case was evaluated twice (once with and once without assistance by the AI software mdbrain), with a washout period of at least four weeks between the two sessions. Ten out of twenty cases included at least one false-positive AI finding. Aneurysm ratings, follow-up recommendations, and reading times were assessed using the Wilcoxon signed-rank test. A thematic analysis was performed to summarize reader feedback and observations.
Results: False-positive AI results led to significantly higher suspicion of aneurysm findings (p = 0.01). Inexperienced readers also recommended significantly more aggressive follow-up examinations when presented with false-positive AI findings (p = 0.005). Reading times were significantly shorter with AI assistance in inexperienced (164.1 vs 228.2 seconds; p < 0.001), moderately experienced (126.2 vs 156.5 seconds; p < 0.009), and very experienced (117.9 vs 153.5 seconds; p < 0.001) readers alike.
Conclusion: Our results demonstrate that radiology readers are susceptible to automation bias when detecting cerebral aneurysms in TOF-MRA studies and encountering false-positive AI findings. In inexperienced readers, this behavior further translated into more aggressive follow-up recommendations. AI assistance resulted in significantly shorter reading times across experience levels. While AI systems for cerebral aneurysm detection can provide benefits, challenges in human-AI interaction need to be mitigated to ensure safe and effective adoption.
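The abstract names the Wilcoxon signed-rank test as the method for comparing paired readings with and without AI assistance. As a rough illustration of how such a paired comparison works, the following is a minimal sketch in plain Python. It assumes the textbook variant (zero differences dropped, average ranks for ties, two-sided p-value from a normal approximation without continuity correction); the abstract does not say which variant or software the authors used. The function name `wilcoxon_signed_rank` and the reading times below are hypothetical and invented for illustration only; they are not data from the study.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p): W is the smaller of the positive and negative rank
    sums, p comes from a normal approximation with tie correction.
    """
    # Paired differences; pairs with no difference are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Normal approximation of W under the null hypothesis.
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p


# Hypothetical reading times in seconds for eight cases (NOT study data).
unassisted = [230, 210, 250, 190, 240, 205, 260, 225]
assisted = [170, 180, 160, 200, 150, 175, 165, 185]

w, p = wilcoxon_signed_rank(unassisted, assisted)
print(f"W = {w}, two-sided p = {p:.3f}")
```

With samples this small, an exact p-value is usually preferable to the normal approximation; the approximation is used here only to keep the sketch free of dependencies.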