Prioritizing Human Disease Genes by Multiple Data Integration

Bolin Chen, Jianxin Wang, Fang-Xiang Wu
DOI: https://doi.org/10.1109/bibm.2013.6732576
2013-01-01
Abstract: Multiple types of data are now available for prioritizing human disease genes, including gene-disease associations, disease phenotype similarities, and the locations of genes or their corresponding proteins in biological networks. Integrating multiple types of data is expected to be effective for prioritizing human disease genes. In this paper, we propose a multiple data integration method for prioritizing human disease genes, based on Markov random field (MRF) theory and Bayesian analysis. The proposed method is flexible, in that it easily incorporates different kinds of data, and reliable in predicting candidate disease genes. Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, and gene expression profiles. Predictions are evaluated by both the leave-one-out method and the fold enrichment method. The sensitivity and the specificity can both reach roughly 80% simultaneously. The method achieves 56.02-fold enrichment on average when all of these biological data are integrated in our experiments.
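The sensitivity and specificity figures quoted in the abstract are threshold-based metrics computed over held-out predictions. The following is a minimal sketch of how such a pair could be computed, not the authors' evaluation code: the function name `sensitivity_specificity`, the cutoff of 0.5, and the scores and labels are all hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when genes with score >= threshold
    are predicted to be disease genes.

    scores: one predicted score per held-out gene (hypothetical values here).
    labels: True if the gene is a known disease gene, False otherwise.
    """
    tp = fn = tn = fp = 0
    for score, is_disease in zip(scores, labels):
        predicted = score >= threshold
        if is_disease and predicted:
            tp += 1
        elif is_disease:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    # Sensitivity: fraction of true disease genes that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-disease genes that are correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical leave-one-out scores: each score is assumed to come from a
# model fitted with that gene's own label held out.
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.6, 0.4, 0.05, 0.15]
labels = [True, True, True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sweeping the threshold trades one metric against the other, so reporting that both reach a given level "simultaneously" refers to a single cutoff at which both values hold.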