Gene expression-based machine learning approach to predict immunotherapy response and survival in patients with advanced cancer.
Zheng Zhang, Xianyun Wang, Qi Tang, Kaiwei Yang, Xuanjun Guo, Aixiang Wang, Yiqun Zhang, Yantong Hao, Zhihua Pei, Dongliang Wang, Qiming Zhou, Zhisong He
DOI: https://doi.org/10.1200/jco.2023.41.16_suppl.1550
IF: 45.3
2023-06-01
Journal of Clinical Oncology
Abstract 1550. Background: Immune checkpoint blockades (ICBs) have drastically improved the clinical outcomes of cancer patients, yet only ~30% of patients benefit significantly from immunotherapy. TMB, PD-L1, TNB, GEP (Cristescu et al. 2018) and other conventional biomarkers can explain only ~60% of the ICB response, suggesting that novel factors are yet to be discovered. Methods: Gene expression profiles and clinical information of 672 patients were collected from the IMvigor210 (n=348, bladder cancer), Pender et al. (n=98, 20 tumor types), Kim et al. (n=45, gastric cancer) and CheckMate (n=181, kidney cancer) cohorts, in which all patients were treated with ICBs. In each sample, gene expression intensities normalized by RPKM were ranked in descending order (ranks 1-20000) and then scaled to the 0-1 range. To identify immune signatures correlated with both response and survival, more than 1,000 immune genes were downloaded from the InnateDB database, and their expression values were compared between two groups of IMvigor210 patients divided according to response (CR+PR vs. SD+PD, Wilcoxon test) or overall survival (≥mOS vs. <mOS, Cox survival analysis). A Bayesian-regularization neural network (BRNN) was used to develop the risk model to predict ICB response and prognosis. 75% of the IMvigor210 data were randomly selected as the training set, and the remaining 25% served as the test set. The performance of the predictor was then independently validated in the Pender et al., Kim et al., and CheckMate cohorts. ROC curves were used to evaluate predictive accuracy for response. Survival curves were estimated with the Kaplan-Meier method and compared by log-rank test. Results: Forty DEGs (the top 20 DEGs positively correlated with ORR/OS and the top 20 DEGs negatively correlated with ORR/OS) were identified as immune signatures, each of which could significantly distinguish ICB response and survival. 
Our model outperformed TMB or GEP in predicting response, with AUC as the performance metric, in the IMvigor210 (Model = 0.986 vs. TMB = 0.728 vs. GEP = 0.551), Pender et al. (Model = 0.757 vs. TMB = 0.593 vs. GEP = 0.621), Kim et al. (Model = 0.881 vs. GEP = 0.828), and CheckMate (Model = 0.927 vs. TMB = 0.543) cohorts. Our model also showed promising results in differentiating longer survivors from shorter survivors, with p values of <0.0001, 0.064, 0.021, and 0.0006 in the IMvigor210, Pender et al., Kim et al., and CheckMate cohorts, respectively. Conclusions: We propose a deep learning model to predict ICB response and survival in which feature selection is based on the rank order of immune gene expression, which accommodates all kinds of platforms and expression data types. Our model showed robust prediction not only in bladder cancer but also in other cancer types.
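The per-sample rank transformation described in the Methods (rank expression values in descending order, then scale to the 0-1 range) can be illustrated with a minimal sketch. The abstract does not specify how ties are handled or which end of the ranking maps to 0, so the function name `rank_scale`, the tie-breaking by order of appearance, and the mapping of rank 1 to 0.0 are all assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np


def rank_scale(expr):
    """Rank one sample's expression values in descending order and
    min-max scale the ranks to [0, 1].

    Assumptions (not stated in the abstract): ties are broken by order
    of appearance, and rank 1 (highest expression) maps to 0.0.
    """
    expr = np.asarray(expr, dtype=float)
    n = expr.size
    if n < 2:
        # A single gene has no meaningful rank range; return zeros.
        return np.zeros(n)
    # Indices of genes from highest to lowest expression.
    order = np.argsort(-expr, kind="stable")
    ranks = np.empty(n, dtype=float)
    ranks[order] = np.arange(1, n + 1)  # rank 1 = most highly expressed
    # Scale ranks so that rank 1 -> 0.0 and rank n -> 1.0.
    return (ranks - 1) / (n - 1)


if __name__ == "__main__":
    rpkm = np.array([10.0, 0.5, 3.2, 7.1])  # hypothetical values for one sample
    print(rank_scale(rpkm))
    # Any strictly increasing re-scaling of the same sample yields the
    # same ranks, e.g. a log transform:
    print(rank_scale(np.log2(rpkm + 1)))
```

Because only the ordering of genes within a sample is used, any strictly increasing transformation of the expression values leaves the output unchanged, which is consistent with the abstract's claim that the approach accommodates different platforms and expression data types.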
oncology