SVM-based Method for Predicting Enzyme Function in a Hierarchical Context

Yong-Cui Wang, Zhi-Xia Yang, Nai-Yang Deng
2010-01-01
Abstract: Automatically categorizing enzymes into the Enzyme Commission (EC) hierarchy is crucial to understanding their specific molecular mechanisms. Standard machine learning methods such as the support vector machine (SVM) and the naive Bayes classifier have been successfully applied to this task. However, they treat each functional class independently and ignore the inter-class relationships. In this paper, we develop an SVM-based method for predicting enzyme function within the EC hierarchy. Our method, which has low computational complexity, is a modified version of a structured predictive model, the Hierarchical Max-Margin Markov algorithm (HM3). HM3, which is specifically designed for hierarchical multi-label classification, has been successfully used in many structured pattern recognition problems, such as document categorization, web content classification, and enzyme function prediction. As input features for our predictive model, we use the conjoint triad feature (CTF). Our method has been validated on an enzyme benchmark dataset in which each protein has less than 40% sequence identity to any other protein in the same functional class. For the first three EC digits, the predictive accuracy and the Matthews correlation coefficient (MCC) of our method range from 78% to 100% and from 0.76 to 1, respectively. We therefore expect our method to be a useful supplementary tool for future studies in enzyme function prediction.
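The abstract names the conjoint triad feature (CTF) as the model input but does not describe how it is computed. The following is a minimal sketch of one common formulation, not necessarily the authors' exact procedure. It assumes the widely used grouping of the 20 amino acids into seven classes, a 343-dimensional vector of counts over windows of three consecutive residues, and a (count - min) / max normalization. The function names, the class grouping, the normalization, and the choice to skip windows containing non-standard residues are all assumptions made for illustration.

```python
# Assumed seven-class amino acid grouping often used with the conjoint
# triad representation; the abstract does not state the grouping.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}
N_CLASSES = len(CLASSES)  # 7 classes -> 7**3 = 343 possible triads


def conjoint_triad_counts(sequence):
    """Count class triads over every window of three consecutive residues."""
    counts = [0] * (N_CLASSES ** 3)
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        # Assumption: windows containing a non-standard residue are skipped.
        if a is None or b is None or c is None:
            continue
        counts[a * N_CLASSES ** 2 + b * N_CLASSES + c] += 1
    return counts


def conjoint_triad_feature(sequence):
    """Return a 343-dimensional vector normalized as (count - min) / max."""
    counts = conjoint_triad_counts(sequence)
    lo, hi = min(counts), max(counts)
    if hi == 0:
        # No valid triad was found; return an all-zero vector.
        return [0.0] * len(counts)
    return [(c - lo) / hi for c in counts]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    vector = conjoint_triad_feature("MKVLAAGIRDEC")
    print(len(vector), sum(1 for v in vector if v > 0))
```

Grouping residues into classes before counting keeps the feature length fixed at 343 regardless of sequence length, so the vectors can be passed directly to a kernel method such as an SVM.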