Abstract:

BACKGROUND: One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and on the discriminatory power of individual genes, often fail to identify biologically meaningful biomarkers, resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should interact synergistically to produce more effective classifiers than individual biomarkers.

RESULTS: We developed an integrated approach, the network-constrained support vector machine (netSVM), for cancer biomarker identification with improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction (PPI) data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. More importantly, several novel hub genes, which are biologically important and have many interactions in the PPI network but often show little change in expression compared with their downstream genes, were also identified as network biomarkers; these genes were enriched in signaling pathways such as the TGF-beta, MAPK, and JAK-STAT signaling pathways. These signaling pathways may provide new insight into the underlying mechanism of breast cancer metastasis.

CONCLUSIONS: We have developed a network-based approach for cancer biomarker identification, netSVM, resulting in improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis and help improve prediction performance across independent data sets.
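The abstract does not state the netSVM objective, so the following is only a minimal sketch of one common way to impose a network constraint on a linear SVM: adding a graph Laplacian penalty so that genes connected in the PPI network receive similar weights. The objective, the function names (`normalized_laplacian`, `fit_network_svm`), the parameters (`lam`, `C`, `lr`, `epochs`), the plain subgradient-descent solver, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    Isolated nodes (degree 0) simply keep a diagonal entry of 1 here."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros(len(deg), dtype=float)
    nz = deg > 0
    inv_sqrt[nz] = 1.0 / np.sqrt(deg[nz])
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def fit_network_svm(X, y, adj, lam=1.0, C=1.0, lr=0.01, epochs=500):
    """Subgradient descent on the assumed objective
    0.5*||w||^2 + lam * w^T L w + (C/n) * sum_i max(0, 1 - y_i (x_i.w + b))."""
    n, p = X.shape
    L = normalized_laplacian(adj)
    w, b = np.zeros(p), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples currently violating the margin
        grad_w = w + 2.0 * lam * (L @ w) - C * (X[active].T @ y[active]) / n
        grad_b = -C * y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (hypothetical): 4 genes, PPI edges 0-1 and 2-3; only genes 0 and 1
# carry the class signal.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
y = np.repeat([1.0, -1.0], 20)
X = 0.5 * rng.standard_normal((40, 4)) + y[:, None] * np.array([1.0, 1.0, 0.0, 0.0])

w, b = fit_network_svm(X, y, adj)
accuracy = np.mean(np.sign(X @ w + b) == y)
print("weights:", np.round(w, 3), "training accuracy:", accuracy)
```

In this sketch, the term `lam * w^T L w` penalizes differences between the weights of interacting genes, which is one way a hub gene with little expression change could still receive a non-negligible weight through its neighbors. Larger `lam` enforces more smoothness over the network; `lam = 0` reduces the objective to an ordinary linear SVM.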

Identifying the Gene Signatures from Gene-Pathway Bipartite Network Guarantees the Robust Model Performance on Predicting the Cancer Prognosis.

Identifying Oncogenes As Features for Clinical Cancer Prognosis by Bayesian Nonparametric Variable Selection Algorithm

Enhancing Cancer Driver Gene Prediction by Protein-Protein Interaction Network

An Improved Method for Prediction of Cancer Prognosis by Network Learning

Network-Based Inference Framework for Identifying Cancer Genes from Gene Expression Data

Incorporating gene co-expression network in identification of cancer prognosis markers

Extracting a Few Functionally Reproducible Biomarkers to Build Robust Subnetwork-Based Classifiers for the Diagnosis of Cancer.

Integrating Biological Knowledge with Gene Expression Profiles for Survival Prediction of Cancer

Identifying Cancer Biomarkers by Network-Constrained Support Vector Machines.

Identifying Cancer Prognostic Modules by Module Network Analysis

Ensemble Classifier Based on Gene Synergistic Network Improves Breast Cancer Outcome Prediction

Uncovering the Prognostic Gene Signatures for the Improvement of Risk Stratification in Cancers by Using Deep Learning Algorithm Coupled with Wavelet Transform

Integrative Analysis Based on Survival Associated Co-Expression Gene Modules for Predicting Neuroblastoma Patients' Survival Time

Identification of cancer prognosis-associated functional modules using differential co-expression networks.

Importance of gene expression signatures in pancreatic cancer prognosis and the establishment of a prediction model

Protein interaction network underpins concordant prognosis among heterogeneous breast cancer signatures

Integrated Analysis of miRNA and Gene Networks on Cancer Expression Data

Finding disagreement pathway signatures and constructing an ensemble model for cancer classification

Prognostic Gene Expression Signature Revealed the Involvement of Mutational Pathways in Cancer Genome

Identifying Dysregulated Pathways in Cancers from Pathway Interaction Networks