Deep learning-based multi-modal data integration enhancing breast cancer disease-free survival prediction
Zehua Wang, Ruichong Lin, Yanchun Li, Jin Zeng, Yongjian Chen, Wenhao Ouyang, Han Li, Xueyan Jia, Zijia Lai, Yunfang Yu, Herui Yao, Weifeng Su
DOI: https://doi.org/10.1093/pcmedi/pbae012
2024-05-29
Abstract:
Background: The prognosis of breast cancer is often unfavorable, emphasizing the need for early detection of metastasis risk and accurate prediction of treatment outcomes. This study aimed to develop a novel multi-modal deep learning model that uses preoperative data to predict disease-free survival (DFS).
Methods: We retrospectively collected pathology imaging, molecular, and clinical data from The Cancer Genome Atlas and one independent institution in China. We developed a novel Deep Learning Clinical Medicine Based Pathological Gene Multi-modal (DeepClinMed-PGM) model for DFS prediction, integrating clinicopathological data with molecular insights. The study population comprised a training cohort (n = 741), an internal validation cohort (n = 184), and an external testing cohort (n = 95).
Results: Integrating multi-modal data into the DeepClinMed-PGM model significantly improved area under the receiver operating characteristic curve (AUC) values. In the training cohort, AUC values for 1-, 3-, and 5-year DFS predictions increased to 0.979, 0.957, and 0.871, respectively, while in the external testing cohort the values reached 0.851, 0.878, and 0.938 for 1-, 2-, and 3-year DFS predictions, respectively. The model's robust discriminative capability was consistently evident across cohorts, including the training cohort [hazard ratio (HR) 0.027, 95% confidence interval (CI) 0.0016-0.046, P < 0.0001], the internal validation cohort (HR 0.117, 95% CI 0.041-0.334, P < 0.0001), and the external cohort (HR 0.061, 95% CI 0.017-0.218, P < 0.0001). Additionally, the DeepClinMed-PGM model achieved C-index values of 0.925, 0.823, and 0.864 in the three cohorts, respectively.
Conclusion: This study introduces an approach to breast cancer prognosis that integrates imaging, molecular, and clinical data for enhanced predictive accuracy, offering promise for personalized treatment strategies.
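The C-index values reported above summarize how often a model ranks pairs of patients in the correct order of risk. The following is a minimal sketch of Harrell's concordance index for right-censored data, not the authors' implementation. The function name `harrell_c_index` and the toy inputs are assumptions made for illustration, and pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's concordance index for right-censored data.

    times:       observed follow-up time per patient
    events:      1 if the event (e.g. recurrence) was observed, 0 if censored
    risk_scores: model output per patient; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the patient with
        # the shorter time had an observed event (simplification: tied times
        # are skipped entirely).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not from the study).
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
print(harrell_c_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for reported values such as 0.864 in the external cohort.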