Automated segment-level coronary artery calcium scoring on non-contrast CT: a multi-task deep-learning approach

Bernhard Föllmer, Sotirios Tsogias, Federico Biavati, Kenrick Schulze, Maria Bosserdt, Lars Gerrit Hövermann, Sebastian Stober, Wojciech Samek, Klaus F Kofoed, Pál Maurovich-Horvat, Patrick Donnelly, Theodora Benedek, Michelle C Williams, Marc Dewey
DOI: https://doi.org/10.1186/s13244-024-01827-0
2024-10-16
Abstract:
Objectives: To develop and evaluate a multi-task deep-learning (DL) model for automated segment-level coronary artery calcium (CAC) scoring on non-contrast computed tomography (CT) for precise localization and quantification of calcifications in the coronary artery tree.
Methods: This study included 1514 patients (mean age, 60.0 ± 10.2 years; 56.0% female) with stable chest pain from 26 centers participating in the multicenter DISCHARGE trial (NCT02400229). The patients were randomly assigned to a training/validation set (1059) and a test set (455). We developed a multi-task neural network that performs segment-level segmentation of calcifications as the main task and segmentation of coronary artery segment regions with weak annotations as an auxiliary task. Model performance was evaluated using (micro-average) sensitivity, specificity, F1-score, and weighted Cohen's κ for segment-level agreement based on the Agatston score, and by performing an interobserver variability analysis.
Results: In the test set of 455 patients with 1797 calcifications, the model assigned 73.2% (1316/1797) to the correct coronary artery segment. The model achieved a micro-average sensitivity of 0.732 (95% CI: 0.710-0.754), a micro-average specificity of 0.978 (95% CI: 0.976-0.980), and a micro-average F1-score of 0.717 (95% CI: 0.695-0.739). The segment-level agreement was good, with a weighted Cohen's κ of 0.808 (95% CI: 0.790-0.824), which was only slightly lower than the agreement between the first and second observer (0.809; 95% CI: 0.798-0.845).
Conclusion: Automated segment-level CAC scoring using a multi-task neural network approach showed good agreement on the segment level, indicating that DL has the potential for automated coronary artery calcification classification.
Critical relevance statement: Multi-task deep learning can perform automated coronary calcium scoring on the segment level with good agreement and may contribute to the development of new and improved calcium scoring methods.
Key points: Segment-level coronary artery calcium scoring is a tedious and error-prone task. The proposed multi-task model achieved good agreement with a human observer on the segment level. Deep learning can contribute to the automation of segment-level coronary artery calcium scoring.
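The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the `Counts` tuple, the function names, and the linear disagreement weights are assumptions made for illustration, since the abstract does not state the weighting scheme or how the per-segment counts were tallied. Micro-averaging here means pooling true/false positives and negatives over all segment classes before computing each ratio.

```python
from collections import namedtuple

# Hypothetical container for one segment class's confusion counts.
Counts = namedtuple("Counts", "tp fp fn tn")


def micro_average(per_class):
    """Pool counts over all classes, then compute sensitivity, specificity, F1."""
    tp = sum(c.tp for c in per_class)
    fp = sum(c.fp for c in per_class)
    fn = sum(c.fn for c in per_class)
    tn = sum(c.tn for c in per_class)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return sensitivity, specificity, f1


def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Weighted Cohen's kappa for two raters over ordinal categories 0..k-1.

    Assumes n_categories >= 2 and that the raters' marginals are not both
    concentrated on one identical category (otherwise the expected weighted
    disagreement is zero and kappa is undefined).
    """
    k = n_categories
    n = len(rater_a)
    # Observed joint proportions of (rater_a category, rater_b category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0  # observed weighted disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            if weights == "quadratic":
                w = w * w
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    return 1.0 - num / den


if __name__ == "__main__":
    # Share of calcifications assigned to the correct segment, from the abstract.
    print(round(1316 / 1797, 3))
```

The categories passed to `weighted_kappa` would be ordinal bins per segment (for example, Agatston score categories); the binning itself is not specified in the abstract and is left to the caller.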