Reinvestigating the Performance of Artificial Intelligence Classification Algorithms on COVID-19 X-Ray and CT Images

Rui Cao, Yanan Liu, Xin Wen, Caiqing Liao, Xin Wang, Yuan Gao, Tao Tan
DOI: https://doi.org/10.1016/j.isci.2024.109712
IF: 5.8
2024-01-01
iScience
Abstract: There are concerns that artificial intelligence (AI) algorithms may introduce underdiagnosis bias by mislabeling patients with certain attributes (e.g., female and young) as healthy. Addressing this bias is crucial given the urgent need for AI diagnostics in the face of rapidly spreading infectious diseases such as COVID-19. We find that prevalent AI diagnostic models show underdiagnosis among specific patient populations, and that the underdiagnosis rate is higher in some intersectional populations (for example, females aged 20-40 years). Additionally, we find that training AI models on heterogeneous datasets (positive and negative samples drawn from different datasets) may lead to poor model generalization. The model's classification performance varies significantly across test sets, with accuracy on the best-performing test set more than 40% higher than on the worst-performing one. In conclusion, we developed an AI bias analysis pipeline to help researchers recognize and address biases that affect medical equality and ethics.
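To make the subgroup comparison concrete, the following is a minimal sketch of how an underdiagnosis rate could be computed per attribute and per intersection of attributes. It is not the authors' pipeline. It assumes the underdiagnosis rate is the fraction of truly positive patients that a model labels as healthy (the false-negative rate within a group), and the function name `underdiagnosis_rates`, the record fields (`sex`, `age_band`, `label`, `pred`), and the toy records are all hypothetical.

```python
from collections import defaultdict


def underdiagnosis_rates(records, group_keys):
    """Return {group: share of truly positive patients predicted healthy}.

    Assumption: label == 1 means diseased, pred == 0 means predicted healthy.
    Groups are tuples built from the values of `group_keys`.
    """
    positives = defaultdict(int)  # truly positive patients per group
    missed = defaultdict(int)     # truly positive patients predicted healthy
    for r in records:
        if r["label"] != 1:
            continue  # only diseased patients can be underdiagnosed
        group = tuple(r[k] for k in group_keys)
        positives[group] += 1
        if r["pred"] == 0:
            missed[group] += 1
    return {g: missed[g] / n for g, n in positives.items()}


# Toy, made-up records for illustration only (not data from the paper).
records = [
    {"sex": "F", "age_band": "20-40", "label": 1, "pred": 0},
    {"sex": "F", "age_band": "20-40", "label": 1, "pred": 1},
    {"sex": "F", "age_band": "40-60", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "20-40", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "40-60", "label": 1, "pred": 0},
    {"sex": "M", "age_band": "40-60", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "40-60", "label": 0, "pred": 0},
]

# Single-attribute view, then the intersectional view.
by_sex = underdiagnosis_rates(records, ["sex"])
by_sex_age = underdiagnosis_rates(records, ["sex", "age_band"])

for group, rate in sorted(by_sex_age.items()):
    print(group, round(rate, 2))
```

In this toy data the two sexes have the same rate when viewed alone, while the intersectional grouping separates subgroups with different rates. That is the kind of gap an intersectional analysis is meant to surface.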