Bayesian Data Integration and Enrichment Analysis for Predicting Gene Function in Malaria

Philip M. R. Tedder, James R. Bradford, Chris J. Needham, Glenn A. McConkey, Andrew J. Bulpitt, David R. Westhead
DOI: https://doi.org/10.1007/978-3-642-03073-4_47
2009-01-01
Abstract: Malaria is one of the world’s most deadly diseases and is caused by the parasite Plasmodium falciparum. Sixty percent of P. falciparum genes have no known function and therefore new methods of gene function prediction are needed. To address this problem, we train a naïve Bayes classifier on multiple sources of data and subsequently apply a modified version of the Gene Set Enrichment Analysis algorithm to predict gene function in P. falciparum. To define gene function, we exploit the hierarchical structure of the Gene Ontology, specifically using the Biological Process category. We demonstrate the value of integrating multiple data sources by achieving accurate predictions on genes that cannot be annotated using simple sequence-similarity-based methods.
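The two-stage idea in the abstract (combine evidence with a naïve Bayes model, then test ranked genes for enrichment of a functional category) can be illustrated with a minimal sketch. This is not the authors' implementation: the three binary evidence sources, the likelihood values, the gene names, and the gene set are all hypothetical, and the enrichment step is the plain unweighted running-sum statistic rather than the modified version the abstract mentions.

```python
import math

def naive_bayes_log_odds(features, pos_rates, neg_rates, prior=0.5):
    """Combine binary evidence sources into one log-odds score,
    assuming the sources are conditionally independent (naive Bayes)."""
    score = math.log(prior / (1 - prior))
    for present, p, n in zip(features, pos_rates, neg_rates):
        if present:
            score += math.log(p / n)              # evidence observed
        else:
            score += math.log((1 - p) / (1 - n))  # evidence absent
    return score

def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style running sum: step up at members of
    gene_set, step down otherwise, return the largest deviation."""
    hits = sum(1 for g in ranked_genes if g in gene_set)
    misses = len(ranked_genes) - hits
    if hits == 0 or misses == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g in ranked_genes:
        running += 1 / hits if g in gene_set else -1 / misses
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical likelihoods P(evidence | related) and P(evidence | unrelated)
# for three made-up data sources.
POS_RATES = [0.8, 0.7, 0.6]
NEG_RATES = [0.2, 0.3, 0.4]

# Hypothetical evidence linking each candidate gene to a query gene.
evidence = {
    "g1": (1, 1, 1),
    "g2": (1, 1, 0),
    "g3": (0, 1, 0),
    "g4": (0, 0, 1),
    "g5": (0, 0, 0),
}

scores = {g: naive_bayes_log_odds(f, POS_RATES, NEG_RATES)
          for g, f in evidence.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Hypothetical Gene Ontology term annotated to g1 and g2.
go_term_genes = {"g1", "g2"}
es = enrichment_score(ranked, go_term_genes)
```

A positive score here means the genes annotated with the term cluster near the top of the ranking, which is the kind of signal an enrichment-based predictor would treat as support for assigning that term to the query gene. In practice the significance of such a score would be assessed against a null distribution, which this sketch omits.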
What problem does this paper attempt to address?