Radiotranscriptomics of non-small cell lung carcinoma for assessing high-level clinical outcomes using a machine learning-derived multi-modal signature

Eleftherios Trivizakis, Nikoletta-Maria Koutroumpa, John Souglakos, Apostolos Karantanas, Michalis Zervakis, Kostas Marias
DOI: https://doi.org/10.1186/s12938-023-01190-z
2023-12-15
Abstract: Background: Multi-omics research has the potential to holistically capture intra-tumor variability, thereby improving therapeutic decisions by incorporating the key principles of precision medicine. The purpose of this study is to identify a robust method of integrating features from different sources, such as imaging, transcriptomics, and clinical data, to predict the survival and therapy response of non-small cell lung cancer patients. Methods: 2996 radiomics, 5268 transcriptomics, and 8 clinical features were extracted from the NSCLC Radiogenomics dataset. Radiomics and deep features were calculated based on the volume of interest in pre-treatment, routine CT examinations, and then combined with RNA-seq and clinical data. Several machine learning classifiers were used to perform survival analysis and assess patients' response to adjuvant chemotherapy. The proposed analysis was evaluated on an unseen testing set in a k-fold cross-validation scheme. Score- and concatenation-based multi-omics were used as feature integration techniques. Results: Six radiomics (elongation, cluster shade, entropy, variance, gray-level non-uniformity, and maximal correlation coefficient), six deep features (NasNet-based activations), and three transcriptomics (OTUD3, SUCGL2, and RQCD1) were found to be significant for therapy response. The examined score-based multi-omics approach improved the AUC by up to 0.10 on the unseen testing set (0.74 ± 0.06) and improved the balance between sensitivity and specificity for predicting therapy response in 106 patients, yielding less biased models than the single-source models, which were either highly sensitive or highly specific. Six radiomics (kurtosis, GLRLM- and GLSZM-based non-uniformity from images with no filtering, biorthogonal, and Daubechies wavelets), seven deep features (ResNet-based activations), and seven transcriptomics (ELP3, ZZZ3, PGRMC2, TRAK1, ATIC, USP7, and PNPLA2) were found to be significant for the survival analysis. Accordingly, the survival analysis for 115 patients was also improved by up to 0.20 in terms of the C-index (0.79 ± 0.03) by the proposed score-based multi-omics. Conclusions: Compared to single-source models, multi-omics integration has the potential to improve prediction performance, increase model stability, and reduce bias for both treatment response and survival analysis.
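The two integration strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unweighted mean used to combine scores, the contiguous fold layout, and all toy values are assumptions made only for illustration. Concatenation-based integration joins each patient's feature vectors from every source before a single model is trained, while score-based integration combines the outputs of separate single-source models.

```python
from statistics import mean


def concatenate_features(*modalities):
    """Concatenation-based integration (hypothetical helper): join the
    per-patient feature vectors of every modality into one vector."""
    n_patients = len(modalities[0])
    if any(len(m) != n_patients for m in modalities):
        raise ValueError("every modality needs one row per patient")
    return [[value for row in rows for value in row] for rows in zip(*modalities)]


def fuse_scores(*score_lists):
    """Score-based integration (hypothetical helper): average the scores
    of separate single-source models for each patient. The unweighted
    mean is an assumption, not a rule taken from the paper."""
    n_patients = len(score_lists[0])
    if any(len(s) != n_patients for s in score_lists):
        raise ValueError("every model needs one score per patient")
    return [mean(scores) for scores in zip(*score_lists)]


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds, so each
    patient is held out exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Toy values for two patients (invented, not from the dataset).
radiomics = [[0.1, 0.2], [0.3, 0.4]]
transcriptomics = [[1.0], [2.0]]
clinical = [[65], [71]]

early = concatenate_features(radiomics, transcriptomics, clinical)
late = fuse_scores([0.8, 0.2], [0.6, 0.4])  # two single-source models
folds = list(k_fold_indices(5, 2))
```

In this sketch `early` holds one combined feature vector per patient, ready for a single classifier, whereas `late` holds one fused score per patient and leaves each single-source model untouched.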