Protein Function Prediction Using Dependence Maximization.

Guo-Xian Yu, Carlotta Domeniconi, Huzefa Rangwala, Guoji Zhang
DOI: https://doi.org/10.1007/978-3-642-40988-2_37
2013-01-01
Abstract: Protein function prediction is one of the fundamental tasks in the post-genomic era. The vast amount of available proteomic data makes it possible to annotate proteins computationally. Most computational approaches predict protein functions from labeled proteins and assume that the annotation of those proteins is complete, with no missing functions. However, partially annotated proteins are common in real-world scenarios; that is, a protein may have some confirmed functions while it is unknown whether it has others. In this paper, we make use of partially annotated proteomic data and propose an approach called Protein Function Prediction using Dependency Maximization (ProDM). ProDM leverages the correlation between different function labels and the 'guilt by association' rule between proteins, and maximizes the dependency between function labels and the feature expression of proteins. ProDM can replenish the missing functions of partially annotated proteins (a seldom studied problem), and can predict functions for completely unlabeled proteins using partially annotated ones. An empirical study on publicly available protein-protein interaction (PPI) networks shows that, when the number of missing functions is large, ProDM performs significantly better than other related methods with respect to various evaluation criteria.
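The abstract does not state which dependency measure ProDM maximizes. As a minimal sketch of what "dependency between function labels and feature expression" can mean, the example below assumes the empirical Hilbert-Schmidt Independence Criterion (HSIC) with linear kernels, a commonly used dependence measure. The toy feature vectors, the label matrices, and the helper names (linear_kernel, center, hsic) are all hypothetical and are not taken from the paper.

```python
# Illustrative only: empirical HSIC with linear kernels on toy data.
# The measure, data, and function names are assumptions, not ProDM itself.

def linear_kernel(rows):
    """Gram matrix of pairwise inner products between row vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]

def center(K):
    """Double-center a square matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total for j in range(n)]
            for i in range(n)]

def hsic(K, L):
    """Empirical HSIC: trace(K H L H) / (n - 1)^2 for symmetric K and L."""
    n = len(K)
    Kc = center(K)
    # trace((H K H) L) equals the elementwise sum because L is symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

# Four hypothetical proteins with two features each.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Labels that follow the feature structure, and labels that do not.
labels_aligned = [[1, 0], [1, 0], [0, 1], [0, 1]]
labels_mixed = [[1, 0], [0, 1], [1, 0], [0, 1]]

K = linear_kernel(features)
dep_aligned = hsic(K, linear_kernel(labels_aligned))
dep_mixed = hsic(K, linear_kernel(labels_mixed))
print(dep_aligned > dep_mixed)
```

Under these assumptions, a label assignment that agrees with the feature similarity between proteins scores a higher dependency than one that cuts across it. This is the intuition behind choosing predicted labels that maximize such a score.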