Prediction of CCL2 in breast cancer based on enhanced MRI radiomics model with integrated bioinformatics analysis and machine learning.

Chen Fu, Qiuchen Chen
DOI: https://doi.org/10.1200/jco.2024.42.16_suppl.e12509
IF: 45.3
2024-06-01
Journal of Clinical Oncology
Abstract: e12509
Background: Breast cancer is the most common malignant tumor in women. According to the 2023 Cancer Statistics Report, breast cancer has surpassed lung cancer to become the most common malignant tumor, and its incidence and mortality both rank first among female malignant tumors. This study proposes non-invasive prediction of CCL2 mRNA expression in breast cancer tissue using enhanced MRI radiomics, while integrating bioinformatics analysis to explore the underlying molecular mechanisms and their association with the immune microenvironment.
Methods: Exclusion criteria were applied to select primary solid tumor samples with RNA-seq data (n = 840) from the TCGA-BRCA database. Enhanced MRI imaging data (n = 98) intersecting with the TCGA-BRCA data were obtained from TCIA-BRCA. The imaging data were randomly divided into a training set (n = 70) and a testing set (n = 28) in a 7:3 ratio. A feature subset was selected, and a model was constructed on the training set using the support vector machine (SVM) algorithm. The predictive performance of the model was evaluated using receiver operating characteristic (ROC) curves, calibration curves, and the Hosmer-Lemeshow goodness-of-fit test. The calibration of the radiomics prediction model was assessed using the Brier score, and its clinical utility was demonstrated by decision curve analysis (DCA). External validation of the model was performed using a validation set. The radiomics score (RS) was calculated using the LR radiomics model and categorized as a binary variable (Low/High RS). Subsequently, KEGG pathway, immune-related gene, and gene mutation analyses were conducted, comparing the CCL2 high- and low-expression groups.
Results: A total of 1810 radiomic features with ICC values ≥ 0.75 were obtained. The mRMR (maximum relevance, minimum redundancy) method was then applied, and an optimal subset of 7 features was further identified using the RFE (recursive feature elimination) algorithm. The SVM model demonstrated favorable predictive performance, with an AUC of 0.822 for the training set and 0.779 for the validation set, as indicated by the ROC curve. The calibration curve and the Hosmer-Lemeshow goodness-of-fit test showed good agreement between the probability of high gene expression predicted by the radiomics model and the actual values (P > 0.05). Additionally, DCA demonstrated high clinical utility of the model. The TNF-α signaling pathway was found to influence the occurrence and development of breast cancer. Furthermore, the degree of plasma cell infiltration was observed to be highest in breast cancer.
Conclusions: The CCL2 model constructed using a combination of machine learning and radiomic features can non-invasively predict the prognosis of breast cancer treatment and exhibits good generalizability.
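The two headline evaluation metrics named in the Methods, the area under the ROC curve and the Brier score, can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the function names `roc_auc` and `brier_score` and the toy labels and probabilities are hypothetical, chosen only to show how each metric is defined (1 stands for high CCL2 expression, 0 for low).

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(f"AUC = {auc:.3f}, Brier score = {brier:.3f}")
```

A higher AUC indicates better discrimination between the high- and low-expression groups, while a lower Brier score indicates predicted probabilities that lie closer to the observed outcomes. In practice an established library implementation would normally be preferred over this pairwise version, which is quadratic in the number of cases.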
oncology