Prospective Evaluation of Artificial Intelligence Triage of Intracranial Hemorrhage on Noncontrast Head CT Examinations
Cody H Savage, Manoj Tanwar, Asser Abou Elkassem, Adam Sturdivant, Omar Hamki, Houman Sotoudeh, Gopi Sirineni, Aparna Singhal, Desmin Milner, Jesse Jones, Dirk Rehder, Mei Li, Yufeng Li, Kevin Junck, Srini Tridandapani, Steven A Rothenberg, Andrew D Smith
DOI: https://doi.org/10.2214/AJR.24.31639
2024-09-04
Abstract:
Background: Retrospective studies evaluating artificial intelligence (AI) algorithms for intracranial hemorrhage (ICH) detection on noncontrast CT (NCCT) have shown promising results but lack prospective validation.
Objective: To evaluate the impact of a radiology department's implementation of an AI triage and notification system for ICH detection on head NCCT examinations on radiologists' real-world aggregate diagnostic performance for ICH detection and on report turnaround times for ICH-positive examinations.
Methods: This prospective single-center study included adult patients who underwent head NCCT examinations from May 12, 2021 to June 30, 2021 (phase 1) or September 30, 2021 to December 4, 2021 (phase 2). Before phase 2, the radiology department implemented a commercial AI triage system for ICH detection that processed head NCCT examinations and notified radiologists of positive results through a widget with a floating pop-up display. Examinations were interpreted by neuroradiologists or emergency radiologists, who evaluated examinations without AI assistance in phase 1 and with AI assistance in phase 2. To establish the reference standard, a panel of radiologists reviewed all examinations with discordance between the radiology report and the AI result, as well as a subset of the remaining examinations. Diagnostic performance and report turnaround times were compared using the Pearson chi-square test and the Wilcoxon rank-sum test, respectively. Bonferroni correction was used to account for five diagnostic performance metrics (adjusted significance threshold, .01 [α=.05/5]).
Results: A total of 9954 examinations from 7371 patients (mean age, 54.8±19.8 years; 3773 female, 3598 male) were included. In phases 1 and 2, 19.8% (735/3716) and 21.9% (1368/6238) of examinations, respectively, were positive for ICH (P=.01). Radiologists without versus with AI showed no significant difference in accuracy (99.5% vs 99.2%), sensitivity (98.6% vs 98.9%), PPV (99.0% vs 99.7%), or NPV (99.7% vs 99.7%) (all P>.01); specificity was higher for radiologists without than with AI (99.8% vs 99.3%, respectively, P=.004). Mean report turnaround time for ICH-positive examinations was 147.1 minutes without AI versus 149.9 minutes with AI (P=.11).
Conclusion: An AI triage system for ICH detection did not improve radiologists' diagnostic performance or report turnaround times.
Clinical Impact: This large prospective real-world study does not support use of AI assistance for ICH detection.
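The arithmetic behind two of the reported statistics can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: it recomputes the ICH prevalence in each phase from the counts given in the abstract, applies a Pearson chi-square test to the resulting 2x2 table, and derives the Bonferroni-adjusted threshold. The helper `pearson_chi2_2x2` and all variable names are hypothetical, and the absence of a continuity correction is an assumption, since the abstract does not say whether one was applied.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the abstract: ICH-positive examinations / all examinations.
PHASE1_POS, PHASE1_TOTAL = 735, 3716
PHASE2_POS, PHASE2_TOTAL = 1368, 6238

prevalence_phase1 = 100 * PHASE1_POS / PHASE1_TOTAL
prevalence_phase2 = 100 * PHASE2_POS / PHASE2_TOTAL

# Rows are phases; columns are ICH-positive and ICH-negative examinations.
chi2, p_prevalence = pearson_chi2_2x2(
    PHASE1_POS, PHASE1_TOTAL - PHASE1_POS,
    PHASE2_POS, PHASE2_TOTAL - PHASE2_POS,
)

# Bonferroni correction for the five diagnostic performance metrics.
ALPHA = 0.05
N_METRICS = 5
bonferroni_threshold = ALPHA / N_METRICS

print(f"Prevalence: {prevalence_phase1:.1f}% vs {prevalence_phase2:.1f}%")
print(f"Chi-square = {chi2:.2f}, P = {p_prevalence:.3f}")
print(f"Bonferroni-adjusted threshold = {bonferroni_threshold:.2f}")
```

Under these assumptions the sketch reproduces the reported prevalences (19.8% and 21.9%) and gives a P value that rounds to .01. The per-metric comparisons (accuracy, sensitivity, specificity, PPV, NPV) cannot be recomputed this way because the abstract does not report the underlying confusion-matrix counts.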