Deep-learning and conventional radiomics to predict IDH genotyping status based on magnetic resonance imaging data in adult diffuse glioma
Hongjian Zhang, Xiao Fan, Junxia Zhang, Zhiyuan Wei, Wei Feng, Yifang Hu, Jiaying Ni, Fushen Yao, Gaoxin Zhou, Cheng Wan, Xin Zhang, Junjie Wang, Yun Liu, Yongping You, Yun Yu
DOI: https://doi.org/10.3389/fonc.2023.1143688
IF: 4.7
2023-08-30
Frontiers in Oncology
Abstract: Objectives: In adult diffuse glioma, preoperative detection of isocitrate dehydrogenase (IDH) status helps clinicians develop surgical strategies and evaluate patient prognosis. Here, we aim to identify an optimal machine-learning model for predicting IDH genotype by combining deep-learning (DL) signatures and conventional radiomics (CR) features as model predictors. Methods: A total of 486 patients with adult diffuse glioma were retrospectively collected from our medical center (n=268) and a public database (TCGA, n=218). All included patients were randomly divided into training and validation sets by using nested 10-fold cross-validation. A total of 6,736 CR features were extracted per patient from four MRI modalities, namely T1WI, T1CE, T2WI, and FLAIR. The LASSO algorithm was used for CR feature selection. For each MRI modality, we applied a CNN+LSTM–based neural network to extract DL features and integrated these features into a DL signature after the fully connected layer with sigmoid activation. Eight classic machine-learning models were compared in terms of their prediction performance and stability in IDH genotyping, with the LASSO-selected CR features and integrated DL signatures as model predictors. In the validation sets, prediction performance was evaluated by accuracy and the area under the receiver operating characteristic curve (AUC), while model stability was assessed by the relative standard deviation of the AUC (RSD_AUC). Subgroup analyses of DL signatures and CR features were also conducted separately to explore their independent predictive value. Results: Logistic regression (LR) achieved favorable prediction performance (AUC: 0.920 ± 0.043, accuracy: 0.843 ± 0.044), whereas the support vector machine with a linear kernel (l-SVM) displayed low prediction performance (AUC: 0.812 ± 0.052, accuracy: 0.821 ± 0.050).
With regard to stability, LR also showed high robustness against data perturbation (RSD_AUC: 4.7%). Subgroup analyses showed that DL signatures outperformed CR features (DL, AUC: 0.915 ± 0.054, accuracy: 0.835 ± 0.061, RSD_AUC: 5.9%; CR, AUC: 0.830 ± 0.066, accuracy: 0.771 ± 0.051, RSD_AUC: 8.0%), while DL alone and DL+CR achieved similar prediction results. Conclusion: In IDH genotyping, LR is a promising machine-learning classification model. Compared with CR features, DL signatures exhibit markedly superior predictive value and discriminative capability.
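The stability metric used above, the relative standard deviation of the AUC, can be illustrated with a short calculation. The following is a minimal sketch, assuming the metric is the standard deviation divided by the mean, expressed as a percentage; the function names are hypothetical, and the use of the sample standard deviation (n - 1) for per-fold values is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev


def rsd_percent(mean_value, sd_value):
    """Relative standard deviation (SD / mean) expressed in percent."""
    if mean_value == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * sd_value / abs(mean_value)


def rsd_from_folds(fold_aucs):
    """RSD of per-fold AUCs. Sample SD (n - 1) is an assumption here."""
    return rsd_percent(mean(fold_aucs), stdev(fold_aucs))


# Mean and SD of the AUC as reported in the abstract.
reported = {
    "LR (DL+CR)": (0.920, 0.043),
    "DL only": (0.915, 0.054),
    "CR only": (0.830, 0.066),
}

rsd_table = {name: round(rsd_percent(m, s), 1) for name, (m, s) in reported.items()}
for name, value in rsd_table.items():
    print(f"{name}: RSD of AUC = {value}%")
```

Under this definition, the reported AUC means and standard deviations reproduce the stability figures quoted in the abstract (4.7%, 5.9%, and 8.0%), which suggests, but does not confirm, that this is how the authors computed the metric.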
oncology