Genome-wide Predicting Disease-Related Protein Complexes by Walking on the Heterogeneous Network Based on Data Integration and Laplacian Normalization

Zhiming Liu, Jiawei Luo
DOI: https://doi.org/10.1016/j.compbiolchem.2017.04.007
IF: 3.737
2017-01-01
Computational Biology and Chemistry
Abstract:
Background: Associating protein complexes with human inherited diseases is critical for a better understanding of the biological processes and functional mechanisms underlying disease. Many protein complexes have been identified and functionally annotated by computational and purification methods; however, the particular roles they play in causing disease have not yet been well determined.
Results: In this study, we present a novel method to identify associations between protein complexes and diseases. First, we construct a disease-protein heterogeneous network based on data integration and Laplacian normalization. Second, we apply a random walk with restart on heterogeneous network (RWRH) algorithm to this network to quantify the strength of the association between each protein and the query disease. Third, we sum the scores of member proteins to obtain a summary score for each candidate protein complex, and then rank all candidate protein complexes by their scores. In a series of leave-one-out cross-validation experiments, we found that our method not only achieves high performance but is also robust to the parameters and the network structure. We tested our approach on breast cancer and selected the 20 top-ranked protein complexes; 17 of them are supported by evidence of a connection with breast cancer.
Conclusions: Our proposed method, based on data integration and Laplacian normalization, is effective in identifying disease-related protein complexes. © 2017 Published by Elsevier Ltd.
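The three steps described in the abstract (normalize the network, run a random walk with restart, sum member-protein scores per complex) can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulation, so the sketch uses a symmetric normalization D^{-1/2} A D^{-1/2} on a single combined adjacency matrix and omits the inter-network jump parameter that RWRH formulations typically include. The toy network, the function names, and the restart value of 0.7 are all hypothetical.

```python
import numpy as np


def laplacian_normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes keep zero rows."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def random_walk_with_restart(w, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the update is small."""
    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def score_complexes(protein_scores, complexes):
    """Sum the scores of member proteins to get one score per candidate complex."""
    return {
        name: float(sum(protein_scores[i] for i in members))
        for name, members in complexes.items()
    }


# Hypothetical toy data: 4 proteins (indices 0-3) and 2 diseases (indices 4-5).
ppi = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)      # protein-protein links
disease_sim = np.array([[0, 1],
                        [1, 0]], dtype=float)    # disease-disease links
assoc = np.array([[1, 0],
                  [0, 0],
                  [0, 0],
                  [0, 1]], dtype=float)          # protein-disease links

# Combine the three layers into one heterogeneous adjacency matrix.
hetero = np.block([[ppi, assoc],
                   [assoc.T, disease_sim]])
w = laplacian_normalize(hetero)

# Seed the walk at the query disease (disease 0, which is node index 4).
seed = np.zeros(hetero.shape[0])
seed[4] = 1.0
steady = random_walk_with_restart(w, seed)

# Keep only the protein part of the score vector and aggregate per complex.
complexes = {"complex_A": [0, 1], "complex_B": [2, 3]}
scores = score_complexes(steady[:4], complexes)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network the complex containing the protein directly linked to the query disease ranks first, which is the qualitative behavior the ranking step relies on. A leave-one-out evaluation along the lines the abstract mentions would remove one known association at a time before building the network and check where the held-out item lands in the ranking.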