Abstract:BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists.METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution.CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

CheXphotogenic: Generalization of Deep Learning Models for Chest X-ray Interpretation to Photos of Chest X-rays

CheXtransfer: Performance and Parameter Efficiency of ImageNet Models for Chest X-Ray Interpretation

Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists

Deep neural models for automated multi-task diagnostic scan management—quality enhancement, view classification and report generation

CheXDouble: Dual-Supervised interpretable disease diagnosis model

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

CheSS: Chest X-Ray Pre-trained Model via Self-supervised Contrastive Learning

Screening Patient Misidentification Errors Using a Deep Learning Model of Chest Radiography: A Seven Reader Study

Image projective transformation rectification with synthetic data for smartphone-captured chest X-ray photos classification

Generalizable diagnosis of chest radiographs through attention-guided decomposition of images utilizing self-consistency loss

CheXseen: Unseen Disease Detection for Deep Learning Interpretation of Chest X-rays

Synthetically Enhanced: Unveiling Synthetic Data's Potential in Medical Imaging Research

Can we trust deep learning models diagnosis? The impact of domain shift in chest radiograph classification

Can Artificial Intelligence Reliably Report Chest X-Rays?: Radiologist Validation of an Algorithm trained on 2.3 Million X-Rays

Evaluation of Contemporary Convolutional Neural Network Architectures for Detecting COVID-19 from Chest Radiographs

CheXclusion: Fairness gaps in deep chest X-ray classifiers

CheX-Nomaly: Segmenting Lung Abnormalities from Chest Radiographs using Machine Learning

Comparative Evaluation of Digital and Analog Chest Radiographs to Identify Tuberculosis using Deep Learning Model

Using Convolutional Neural Networks for the Classification of Suboptimal Chest Radiographs

Automatic Localization and Identification of Thoracic Diseases from Chest X-rays with Deep Learning.

Assessing the Performance of Automated Prediction and Ranking of Patient Age from Chest X-rays Against Clinicians