Deep learning-based COVID-19 pneumonia classification using chest CT images: model generalizability

Dan Nguyen, Fernando Kay, Jun Tan, Yulong Yan, Yee Seng Ng, Puneeth Iyengar, Ron Peshock, Steve Jiang

DOI: https://doi.org/10.48550/arXiv.2102.09616

2021-02-19

Abstract: Since the outbreak of the COVID-19 pandemic, worldwide research efforts have focused on applying artificial intelligence (AI) technologies to various medical data of COVID-19-positive patients in order to identify or classify various aspects of the disease, with promising reported results. However, concerns have been raised over their generalizability, given the heterogeneous factors in training datasets. This study aims to examine the severity of this problem by evaluating deep learning (DL) classification models trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries. We collected one dataset at UT Southwestern (UTSW) and three external datasets from different countries: CC-CCII Dataset (China), COVID-CTset (Iran), and MosMedData (Russia). We divided the data into 2 classes: COVID-19-positive and COVID-19-negative patients. We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split. The models trained on a single dataset achieved accuracy/area under the receiver operating characteristic curve (AUC) values of 0.87/0.826 (UTSW), 0.97/0.988 (CC-CCII), and 0.86/0.873 (COVID-CTset) when evaluated on their own dataset. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better. However, the performance dropped close to an AUC of 0.5 (random guess) for all models when evaluated on a dataset outside their training datasets. Including MosMedData, which contained only positive labels, in the training did not necessarily help the performance on the other datasets. Multiple factors likely contribute to these results, such as patient demographics and differences in image acquisition or reconstruction, causing a data shift among different study cohorts.
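
The 72% / 8% / 20% split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_indices`, the fixed seed, and the plain random shuffle are assumptions, and the abstract does not say whether the split was stratified by class.

```python
import random


def split_indices(n, seed=0):
    """Shuffle the indices 0..n-1 and cut them into train/validation/test.

    Hypothetical helper: 20% goes to test, 8% to validation, and the
    remaining 72% to training, matching the proportions in the abstract.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    n_test = n * 20 // 100            # integer arithmetic avoids float rounding
    n_val = n * 8 // 100
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # prints: 720 80 200
```

Note that 72% and 8% together are the 80% left after holding out the test set, so the validation set is one tenth of the non-test data.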

Medical Physics, Machine Learning, Image and Video Processing

What problem does this paper attempt to address?

This paper evaluates how well deep-learning classification models that identify COVID-19-positive patients generalize. Specifically, the researchers ask whether these models can be applied effectively to datasets from different countries, not just to the datasets on which they were trained. The generalization ability of such models has been questioned because training datasets differ in factors such as patient demographics and pre-existing clinical conditions. The study therefore evaluates generalization performance using 3D computed tomography (CT) datasets from different countries. The paper notes that although some AI models have shown very high accuracy in diagnosing and predicting COVID-19-positive patients, their performance drops significantly on unseen datasets, approaching the level of random guessing (AUC close to 0.5). This indicates that, although test results on the same dataset are encouraging, the application of these models in an actual clinical environment may be limited, especially when dealing with data from different regions or hospitals. The researchers train multiple deep-learning models and test them on different combinations of datasets to explore how generalization might be improved and to identify the key factors affecting model performance.
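
To make the "AUC close to 0.5" benchmark concrete, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. It is an illustration only: the function name `auc` and the toy labels and scores are assumptions, not data or code from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for COVID-19-positive, 0 for COVID-19-negative.
    scores: model outputs, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        # AUC is undefined when the evaluation set contains a single class.
        raise ValueError("AUC needs both positive and negative cases")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: scores that separate the two classes perfectly.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # prints: 1.0
# Toy example: identical scores for every case carry no information.
print(auc([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))  # prints: 0.5
```

The single-class check matters for any evaluation set that, like MosMedData as described in the abstract, contains only positive labels: AUC cannot be computed on such a set by itself.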

Evaluating Generalizability of Deep Learning Models Using Indian-COVID-19 CT Dataset

Deep Learning-based Multi-Class COVID-19 Classification with X-ray Images

A Generalizable Artificial Intelligence Model for COVID-19 Classification Task Using Chest X-ray Radiographs: Evaluated Over Four Clinical Datasets with 15,097 Patients

The usage of deep neural network improves distinguishing COVID-19 from other suspected viral pneumonia by clinicians on chest CT: a real-world study

Deep learning-based Covid-19 diagnosis: a thorough assessment with a focus on generalization capabilities

Clinical Applicable AI System Based on Deep Learning Algorithm for Differentiation of Pulmonary Infectious Disease

Machine Learning Automatically Detects COVID-19 using Chest CTs in a Large Multicenter Cohort

Deep Learning-Based Recognizing COVID-19 and other Common Infectious Diseases of the Lung by Chest CT Scan Images

Systematic investigation into generalization of COVID-19 CT deep learning models with Gabor ensemble for lung involvement scoring

Deep learning for COVID-19 detection based on CT images

A novel deep learning-based method for COVID-19 pneumonia detection from CT images

Deep Learning for COVID-19 Chest CT (Computed Tomography) Image Analysis: a Lesson from Lung Cancer

Federated deep learning for detecting COVID-19 lung abnormalities in CT: a privacy-preserving multinational validation study

Using Artificial Intelligence to Detect COVID-19 and Community-acquired Pneumonia Based on Pulmonary CT: Evaluation of the Diagnostic Accuracy

Automated Diagnosis of COVID-19 Using Deep Learning and Data Augmentation on Chest CT

Assisting scalable diagnosis automatically via CT images in the combat against COVID-19

Generalisation challenges in deep learning models for medical imagery: insights from external validation of COVID-19 classifiers

A Novel Automated Classification and Segmentation for COVID-19 using 3D CT Scans

A Deep Learning Approach to Characterize 2019 Coronavirus Disease (COVID-19) Pneumonia in Chest CT Images

Classification of COVID-19 Patients with their Severity Level from Chest CT Scans using Transfer Learning