Identification of Genes Involved in Breast Cancer Metastasis by Integrating Protein-Protein Interaction Information with Expression Data.

Xin Tian,Mingyuan Xin,Jian Luo,Mingyao Liu,Zhenran Jiang
DOI: https://doi.org/10.1089/cmb.2015.0206
IF: 1.549
2016-01-01
Journal of Computational Biology
Abstract:The selection of relevant genes for breast cancer metastasis is critical for the treatment and prognosis of cancer patients. Although much effort has been devoted to the gene selection procedures by use of different statistical analysis methods or computational techniques, the interpretation of the variables in the resulting survival models has been limited so far. This article proposes a new Random Forest (RF)-based algorithm to identify important variables highly related with breast cancer metastasis, which is based on the important scores of two variable selection algorithms, including the mean decrease Gini (MDG) criteria of Random Forest and the GeneRank algorithm with protein-protein interaction (PPI) information. The new gene selection algorithm can be called PPIRF. The improved prediction accuracy fully illustrated the reliability and high interpretability of gene list selected by the PPIRF approach.
What problem does this paper attempt to address?