Sequence-based Predictor of ATP-binding Residues Using Random Forest and mRMR-IFS Feature Selection

Xin Ma, Xiao Sun
DOI: https://doi.org/10.1016/j.jtbi.2014.06.037
IF: 2.405
2014-01-01
Journal of Theoretical Biology
Abstract: We develop a computational and statistical approach (ATPBR) for predicting ATP-binding residues in proteins from amino acid sequences using random forests with a novel hybrid feature. The hybrid feature incorporates a new feature called PSSMPP, the predicted secondary structure, and orthogonal binary vectors. The mRMR-IFS feature selection method is used to construct the best prediction model. ATPBR achieves significantly improved performance over existing methods, with 87.53% accuracy and a Matthews correlation coefficient of 0.554. In addition, further analysis demonstrates that PSSMPP distinguishes more effectively between ATP-binding and non-binding residues. Moreover, the optimal features selected by the mRMR-IFS method improve prediction performance and may provide useful insights into the mechanisms of ATP-protein interactions.
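As a rough illustration of two ingredients named in the abstract, the sketch below encodes a residue's sequence window as orthogonal binary (one-hot) vectors and computes the Matthews correlation coefficient from confusion-matrix counts. It is not the authors' implementation: the function names, the window half-width, and the all-zero encoding for positions beyond the sequence ends are assumptions made for this example, and PSSMPP, predicted secondary structure, and mRMR-IFS selection are not covered.

```python
import math

# The 20 standard amino acids; the ordering is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Return a 20-dimensional orthogonal binary vector for one residue.

    Any symbol outside AMINO_ACIDS (including the padding symbol) maps to
    all zeros. This padding convention is an assumption of the sketch.
    """
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def encode_window(sequence, center, half_width=2):
    """Concatenate one-hot vectors for a window centered on `center`.

    `half_width` is a hypothetical parameter; the abstract does not state
    the window size used by ATPBR.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        # Positions past either terminus are treated as padding.
        residue = sequence[pos] if 0 <= pos < len(sequence) else "-"
        features.extend(one_hot(residue))
    return features


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Each residue then contributes `20 * (2 * half_width + 1)` binary features, which could be concatenated with other per-residue descriptors before training a classifier.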