Bayesian Variable Selection for Parametric Survival Model with Applications to Cancer Omics Data

Weiwei Duan, Ruyang Zhang, Yang Zhao, Sipeng Shen, Yongyue Wei, Feng Chen, David C. Christiani
DOI: https://doi.org/10.1186/s40246-018-0179-x
2018-01-01
Human Genomics
Abstract: Background: Modeling thousands of markers simultaneously has been of great interest in testing associations between genetic biomarkers and disease or disease-related quantitative traits. Recently, an expectation-maximization (EM) approach to Bayesian variable selection (EMVS), which speeds up Bayesian computation with a fast EM algorithm, was developed for continuous or binary outcomes. However, it is not suitable for analyzing the time-to-event outcomes found in many public databases such as The Cancer Genome Atlas (TCGA). Results: We extended EMVS to a high-dimensional parametric survival regression framework (SurvEMVS). A variant of the cyclic coordinate descent (CCD) algorithm was used for efficient iteration in the M-step, and the extended Bayesian information criterion (EBIC) was used to tune the hyperparameters. We evaluated the performance of SurvEMVS using numerical simulations and illustrated its effectiveness on two real datasets. Both the simulations and the two real data analyses show that SurvEMVS performs well in terms of accuracy and computation. Some potential markers associated with survival in lung or stomach cancer were identified. Conclusions: These results suggest that our model is effective and can cope with high-dimensional omics data.
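The abstract states that EBIC guides hyperparameter tuning but gives no formula. As a rough illustration only, the sketch below uses the standard EBIC definition, -2 log L + k log n + 2 γ log C(p, k), and picks the candidate fit with the smallest value. The function names, the `gamma=0.5` default, and the `(hyperparameter, log_likelihood, n_selected)` tuple layout are assumptions made for this example, not the authors' implementation; in survival settings the sample-size term is sometimes taken as the number of observed events, which is likewise not specified here.

```python
import math


def log_binomial(p: int, k: int) -> float:
    """Natural log of the binomial coefficient C(p, k), computed via log-gamma."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ebic(log_likelihood: float, n_samples: int, n_candidates: int,
         n_selected: int, gamma: float = 0.5) -> float:
    """Standard EBIC: -2 log L + k log n + 2 * gamma * log C(p, k).

    With gamma = 0 this reduces to the ordinary BIC.
    """
    return (-2.0 * log_likelihood
            + n_selected * math.log(n_samples)
            + 2.0 * gamma * log_binomial(n_candidates, n_selected))


def select_by_ebic(fits, n_samples: int, n_candidates: int, gamma: float = 0.5):
    """Return the fit with the lowest EBIC.

    `fits` is a list of (hyperparameter, log_likelihood, n_selected) tuples,
    a hypothetical layout chosen for this sketch.
    """
    return min(fits, key=lambda f: ebic(f[1], n_samples, n_candidates, f[2], gamma))


# Hypothetical numbers: a dense fit with a slightly higher likelihood versus a sparse fit.
candidate_fits = [(0.1, -100.0, 20), (1.0, -105.0, 3)]
best = select_by_ebic(candidate_fits, n_samples=50, n_candidates=1000)
print("selected hyperparameter:", best[0])
```

Because the penalty grows with both the number of selected markers and the number of candidate markers, the sparser fit wins in this toy comparison despite its lower log-likelihood.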