A Comparison Study of Artificial Intelligence Performance Against Doctors in Benign-Malignant Classification of Pulmonary Nodules

Weiguo Hu, Jie Zhang, Dingyi Zhou, Shu Xia, Xingxiang Pu, Jianzhong Cao, Mingzhu Zou, Zhangfan Mao, Qibin Song, Xiaodong Zhang
DOI: https://doi.org/10.1515/oncologie-2023-0319
2024-01-01
ONCOLOGIE
Abstract: Objectives: To compare the performance of artificial intelligence (AI) with that of physicians in classifying benign and malignant pulmonary nodules on computed tomography (CT) images. Methods: A total of 506 CT images with pulmonary nodules were retrospectively collected. The AI was trained using in-house software. The diagnostic performance of the AI and of different groups of physicians was compared using receiver operating characteristic (ROC) curves and the area under the curve (AUC). The nodules in the CT images were also analyzed case by case. Results: The diagnostic accuracy of the AI surpassed that of all physician groups, with an AUC of 0.88, a sensitivity of 0.80, a specificity of 0.84, and an accuracy of 0.83. Across the seven physician groups, the AUC ranged from 0.63 to 0.84, sensitivity from 0.4 to 0.76, specificity from 0.8 to 0.85, and accuracy from 0.7 to 0.82. The case-by-case examination yielded professional insights for improving deep learning models. Conclusions: AI showed great potential in the benign-malignant classification of pulmonary nodules, with higher accuracy than the physician groups. AI can provide more accurate information to support clinical decision-making.
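The four metrics reported in the abstract (sensitivity, specificity, accuracy, and AUC) can be illustrated with a minimal sketch. This is not the authors' in-house software: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions chosen only to show how such figures are typically derived from reader or model outputs. AUC is computed here through its rank interpretation, the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign one.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from binary labels (1 = malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, called malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, called benign
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called malignant
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc_rank(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties between a malignant and a benign score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 malignant and 5 benign nodules with malignancy scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

# Assumed decision threshold of 0.5 (the study's operating point is not stated).
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
auc = auc_rank(y_true, scores)
print(metrics, auc)
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figure.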