Development of Omics Data Based Survival Models for Four Female Cancers Using Machine Learning Approaches

SANG HaoKai, GUO ShuLi, QU Hong, ZHAO Min, QU DaCheng
DOI: https://doi.org/10.1360/n052018-00265
2019-01-01
Scientia Sinica Vitae
Abstract: Breast cancer, cervical and endocervical cancer, endometrial cancer, and ovarian cancer are common cancers in women. Because these cancers progress malignantly and effective methods for early diagnosis and prognostic monitoring are lacking, they are among the leading causes of death in female patients. To explore whether high-throughput omics data can contribute to the prognosis of cancer patients, this study used clinical data and multidimensional omics data (DNA methylation, mRNA expression, miRNA expression, and chip-based protein expression data) from 1861 samples of four female cancers in The Cancer Genome Atlas (TCGA) project to construct Cox proportional hazards models and random survival forest models for retrospective prediction of patient survival. Our systematic integration found that, compared with clinical data alone, DNA methylation and miRNA expression data significantly improved survival prediction in patients with cervical and endometrial cancers (prediction efficiency increased by 8.73%–15.03%). Although some omics data improved the performance of survival prediction models for specific cancers, they did not improve predictive performance in the other cancers. In conclusion, our study provides insights into omics-based survival prediction, which may help improve the predictive accuracy of clinical survival analysis.