Automated coronary calcium scoring using deep learning with multicenter external validation
David Eng, Christopher Chute, Nishith Khandwala, Pranav Rajpurkar, Jin Long, Sam Shleifer, Mohamed H. Khalaf, Alexander T. Sandhu, Fatima Rodriguez, David J. Maron, Saeed Seyyedi, Daniele Marin, Ilana Golub, Matthew Budoff, Felipe Kitamura, Marcelo Straus Takahashi, Ross W. Filice, Rajesh Shah, John Mongan, Kimberly Kallianos, Curtis P. Langlotz, Matthew P. Lungren, Andrew Y. Ng, Bhavik N. Patel
DOI: https://doi.org/10.1038/s41746-021-00460-1
IF: 15.2
2021-01-01
npj Digital Medicine
Abstract: Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events, and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven clinical value of CAC, its current implementation in clinical practice has limitations, such as the lack of insurance coverage for the test and the need for capital-intensive CT machines, specialized imaging protocols, and accredited 3D imaging labs for analysis (including personnel and software). Perhaps the greatest gap is the millions of patients who undergo routine chest CT exams and demonstrate coronary artery calcification, but whose calcification is often not reported or cannot feasibly be quantified. We present two deep learning models that automate CAC scoring, demonstrating advantages for both dedicated gated coronary CT exams and routine non-gated chest CTs performed for other reasons, the latter allowing opportunistic screening. First, we trained a gated coronary CT model for CAC scoring that showed near perfect agreement (mean difference in scores = −2.86; Cohen’s Kappa = 0.89, P < 0.0001) with current conventional manual scoring on a retrospective dataset of 79 patients. In a prospective trial of 55 patients, the model performed the task faster (average time for automated CAC scoring using a graphics processing unit (GPU) was 3.5 ± 2.1 s vs. 261 s for manual scoring), with little difference in scores compared to three technologists (mean difference in scores = 3.24, 5.12, and 5.48, respectively).
Then, using CAC scores from paired gated coronary CT as a reference standard, we trained a deep learning model on our internal data and a cohort from the Multi-Ethnic Study of Atherosclerosis (MESA) (total training n = 341, Stanford test n = 42, MESA test n = 46) to perform CAC scoring on routine non-gated chest CT exams, with validation on external datasets (total n = 303) obtained from four geographically disparate health systems. For identifying patients with any CAC (i.e., CAC ≥ 1), sensitivity and positive predictive value (PPV) were high across all datasets (ranges: 80–100% and 87–100%, respectively). For CAC ≥ 100 on routine non-gated chest CTs, which is the latest recommended threshold to initiate statin therapy, our model showed sensitivities of 71–94% and PPVs of 88–100% across all sites. Adoption of this model could allow more patients to be screened with CAC scoring, potentially allowing opportunistic early preventive interventions.
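The agreement and detection metrics quoted in the abstract (mean difference in scores, Cohen's Kappa, and sensitivity and PPV at a CAC threshold) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names are hypothetical, the scores in the usage example are made-up toy values rather than study data, and the four CAC categories used for kappa (0, 1–99, 100–399, ≥ 400) are an assumption, since the abstract does not state which categories were compared.

```python
from collections import Counter


def mean_difference(reference, predicted):
    """Mean of (predicted - reference) over paired scores."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


def sensitivity_ppv(reference, predicted, threshold):
    """Sensitivity and PPV for detecting reference score >= threshold."""
    tp = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        ref_pos, pred_pos = ref >= threshold, pred >= threshold
        if ref_pos and pred_pos:
            tp += 1
        elif pred_pos:
            fp += 1
        elif ref_pos:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


def cac_category(score):
    """Map a CAC score to a category index (ASSUMED cut points)."""
    if score < 1:
        return 0
    if score < 100:
        return 1
    if score < 400:
        return 2
    return 3


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy values for illustration only; NOT data from the study.
    manual = [0, 0, 12, 85, 150, 320, 450, 900]
    model = [0, 3, 10, 110, 90, 300, 380, 950]

    print("mean difference:", mean_difference(manual, model))
    for threshold in (1, 100):
        sens, ppv = sensitivity_ppv(manual, model, threshold)
        print(f"CAC >= {threshold}: sensitivity={sens:.2f}, PPV={ppv:.2f}")
    kappa = cohens_kappa(
        [cac_category(s) for s in manual],
        [cac_category(s) for s in model],
    )
    print(f"kappa over assumed categories: {kappa:.2f}")
```

Thresholding both the reference and the predicted score at the same cut point (1 for any CAC, 100 for the statin threshold) turns the continuous score comparison into the binary detection task for which the abstract reports sensitivity and PPV.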