Discrepancies in ASPECTS obtained by artificial intelligence and experts: Associated factors and prognostic implications

XiaoQing Cheng, Bing Tian, LiJun Huang, Shen Xi, QuanHui Liu, BaiYan Luo, HuiMin Pang, JinJing Tang, Xia Tian, YuXi Hou, LuGuang Chen, Qian Chen, WuSheng Zhu, XinDao Yin, ChenWei Shao, GuangMing Lu
DOI: https://doi.org/10.1016/j.ejrad.2024.111708
2024-08-28
Abstract: Purpose: The differences between the Alberta Stroke Program Early CT Score (ASPECTS) obtained by experts and artificial intelligence (AI) software require elucidation. We aimed to characterize the discrepancies between the ASPECTS obtained by AI and experts and determine the associated factors and prognostic implications. Methods: This multicenter, retrospective, observational cohort study included patients with acute ischemic stroke caused by large-vessel occlusion in the anterior circulation. ASPECTS was determined by AI software (RAPID ASPECTS) and experts from the core laboratory. Intraclass correlation coefficients (ICCs) and Bland-Altman plots were used to illustrate the consistency and discrepancies; logistic regression analyses were used to assess the correlates of inconsistency; and receiver operating characteristic analyses were performed to assess the diagnostic performance for predicting unfavorable clinical outcomes. Results: The study population included 491 patients. The ICC for the expert and AI ASPECTS was 0.63 (95 % confidence interval [CI]: 0.25-0.79). The mean difference between expert and AI ASPECTS was 2.24. Chronic infarcts (odds ratio [OR], 1.9; 95 % CI, 1.1-3.4; P=0.021) and expert scores in the internal capsule (OR, 2.9; 95 % CI, 1.1-7.7; P=0.034) and lentiform (OR, 2.4; 95 % CI, 1.3-4.7; P=0.008) were significant correlates of inconsistency. The ASPECTS obtained by AI showed a significantly higher area under the curve for unfavorable outcomes (0.68 vs. 0.63, P=0.04). Conclusions: In comparison with expert ASPECTS, AI ASPECTS overestimated the infarct extent. Future studies should aim to determine whether AI ASPECTS assessments should use a lower threshold to screen patients for endovascular therapy.
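The Bland-Altman summary mentioned in the Methods (mean difference plus 95% limits of agreement) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `bland_altman`, the expert-minus-AI sign convention, and the toy scores are all assumptions introduced here, and none of the numbers come from the study.

```python
from statistics import mean, stdev


def bland_altman(expert, ai):
    """Return (bias, lower limit, upper limit) for paired scores.

    Assumption: differences are taken as expert minus AI, so a positive
    bias means the AI score is lower (more regions scored as infarcted).
    """
    if len(expert) != len(ai):
        raise ValueError("paired score lists must have equal length")
    diffs = [e - a for e, a in zip(expert, ai)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # Conventional 95% limits of agreement: bias +/- 1.96 * SD
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical ASPECTS values on the 0-10 scale, NOT data from the study.
expert_scores = [9, 8, 10, 7, 9, 8]
ai_scores = [7, 6, 8, 6, 6, 7]

bias, lower, upper = bland_altman(expert_scores, ai_scores)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Under this convention, a positive bias corresponds to the AI assigning lower ASPECTS than the experts, which matches the direction of the abstract's conclusion that AI ASPECTS overestimated infarct extent.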