A Deep-Learning Model for Intracranial Aneurysm Detection on CT Angiography Images in China: a Stepwise, Multicentre, Early-Stage Clinical Validation Study
Bin Hu, Zhao Shi, Li Lu, Zhongchang Miao, Hao Wang, Zhen Zhou, Fandong Zhang, Rongpin Wang, Xiao Luo, Feng Xu, Sheng Li, Xiangming Fang, Xiaodong Wang, Ge Yan, Fajin Lv, Meng Zhang, Qiu Sun, Guangbin Cui, Yubao Liu, Shu Zhang, Chengwei Pan, Zhibo Hou, Huiying Liang, Yuning Pan, Xiaoxia Chen, Xiaorong Li, Fei Zhou, U. Joseph Schoepf, Akos Varga-Szemes, W. Garrison Moore, Yizhou Yu, Chunfeng Hu, Long Jiang Zhang
DOI: https://doi.org/10.1016/s2589-7500(23)00268-6
2024-01-01
Abstract:
Background: Artificial intelligence (AI) models in real-world implementation are scarce. Our study aimed to develop a CT angiography (CTA)-based AI model for intracranial aneurysm detection, assess how it helps clinicians improve diagnostic performance, and validate its application in real-world clinical implementation.
Methods: We developed a deep-learning model using 16 546 head and neck CTA examination images from 14 517 patients at eight Chinese hospitals. Using an adapted, stepwise implementation and evaluation, 120 certified clinicians from 15 geographically different hospitals were recruited. Initially, the AI model was externally validated with images of 900 digital subtraction angiography-verified CTA cases (examinations) and compared with the performance of 24 clinicians who each viewed 300 of these cases (stage 1). Next, as a further external validation, a multi-reader multi-case study enrolled 48 clinicians to individually review 298 digital subtraction angiography-verified CTA cases (stage 2). The clinicians reviewed each CTA examination twice (ie, with and without the AI model), separated by a 4-week washout period. Then, a randomised open-label comparison study enrolled 48 clinicians to assess the acceptance and performance of this AI model (stage 3). Finally, the model was prospectively deployed and validated in 1562 real-world clinical CTA cases.
Findings: The AI model in the internal dataset achieved a patient-level diagnostic sensitivity of 0.957 (95% CI 0.939-0.971) and a higher patient-level diagnostic sensitivity than clinicians (0.943 [0.921-0.961] vs 0.658 [0.644-0.672]; p<0.0001) in the external dataset. In the multi-reader multi-case study, the AI-assisted strategy improved clinicians' diagnostic performance both on a per-patient basis (the area under the receiver operating characteristic curves [AUCs]; 0.795 [0.761-0.830] without AI vs 0.878 [0.850-0.906] with AI; p<0.0001) and a per-aneurysm basis (the area under the weighted alternative free-response receiver operating characteristic curves; 0.765 [0.732-0.799] vs 0.865 [0.839-0.891]; p<0.0001). Reading time decreased with the aid of the AI model (87.5 s vs 82.7 s, p<0.0001). In the randomised open-label comparison study, clinicians in the AI-assisted group had a high acceptance of the AI model (92.6% adoption rate), and a higher AUC when compared with the control group (0.858 [95% CI 0.850-0.866] vs 0.789 [0.780-0.799]; p<0.0001). In the prospective study, the AI model had a 0.51% (8/1570) error rate due to poor-quality CTA images and recognition failure. The model had a high negative predictive value of 0.998 (0.994-1.000) and significantly improved the diagnostic performance of clinicians; AUC improved from 0.787 (95% CI 0.766-0.808) to 0.909 (0.894-0.923; p<0.0001) and patient-level sensitivity improved from 0.590 (0.511-0.666) to 0.825 (0.759-0.880; p<0.0001).
Interpretation: This AI model demonstrated strong clinical potential for intracranial aneurysm detection with improved clinician diagnostic performance, high acceptance, and practical implementation in real-world clinical cases.
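As a reading aid, the patient-level metrics quoted in the Findings can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the Wilson score interval is an assumed choice because the abstract does not state which confidence-interval method was used. The only figures taken from the abstract are the 8 failed examinations out of 1570 in the prospective study.

```python
import math


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of aneurysm-positive patients that were flagged as positive."""
    return true_pos / (true_pos + false_neg)


def negative_predictive_value(true_neg: int, false_neg: int) -> float:
    """Share of negative calls that were truly aneurysm-free."""
    return true_neg / (true_neg + false_neg)


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Figures from the abstract: 8 of 1570 prospective examinations could not be
# processed, which leaves the 1562 cases that were analysed.
failed, attempted = 8, 1570
error_rate = failed / attempted
analysed = attempted - failed
low, high = wilson_interval(failed, attempted)

print(f"error rate: {error_rate:.2%}")
print(f"analysed cases: {analysed}")
print(f"approximate 95% interval for the error rate: {low:.4f} to {high:.4f}")
```

The confusion-matrix helpers take raw counts, which the abstract does not report, so they are shown only to make the definitions of sensitivity and negative predictive value explicit.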