Local-DPP: an Improved DNA-binding Protein Prediction Method by Exploring Local Evolutionary Information.

Leyi Wei, Jijun Tang, Quan Zou
DOI: https://doi.org/10.1016/j.ins.2016.06.026
IF: 8.1
2017-01-01
Information Sciences
Abstract: Increased knowledge of DNA-binding proteins would enhance our understanding of protein functions in cellular biological processes. To handle the explosive growth of protein sequence data, researchers have developed machine learning-based methods that quickly and accurately predict DNA-binding proteins. In recent years, the accuracy of machine learning-based predictors has advanced significantly, but their predictive performance remains unsatisfactory. In this paper, we establish a novel predictor named Local-DPP, which combines local Pse-PSSM (Pseudo Position-Specific Scoring Matrix) features with a random forest classifier. The proposed features efficiently capture local conservation information, together with sequence-order information, from the evolutionary profiles (PSSMs). We evaluate and compare the Local-DPP predictor with state-of-the-art predictors on two stringent benchmark datasets (one for the jackknife test, the other for an independent test). Local-DPP improves on the accuracy of existing predictors, from 77.3% to 79.2% in the jackknife test and from 76.9% to 79.0% in the independent test. This demonstrates the effectiveness of Local-DPP in predicting DNA-binding proteins. Local-DPP is freely accessible to the public through the user-friendly webserver http://server.malab.cn/Local-DPP/Index.html.
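The abstract does not spell out how the local Pse-PSSM features are computed, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes the PSSM is an L x 20 matrix (one row per residue), that "local" means splitting the rows into a fixed number of contiguous segments, and that each segment contributes per-column means (conservation) plus mean squared differences between rows a fixed lag apart (sequence order). The names `pse_pssm`, `local_pse_pssm`, `n_segments`, and `max_lag` are hypothetical.

```python
from typing import List

# Assumed layout: one row per residue, 20 columns (one per amino acid type).
Matrix = List[List[float]]


def pse_pssm(segment: Matrix, max_lag: int) -> List[float]:
    """Pse-PSSM-style features for one block of PSSM rows (assumed form)."""
    n_rows = len(segment)
    n_cols = len(segment[0])
    # Conservation part: mean score of each column over the segment.
    features = [sum(row[j] for row in segment) / n_rows for j in range(n_cols)]
    # Sequence-order part: mean squared difference between rows `lag` apart.
    for lag in range(1, max_lag + 1):
        pairs = n_rows - lag
        for j in range(n_cols):
            if pairs <= 0:
                features.append(0.0)  # segment too short for this lag
            else:
                total = sum(
                    (segment[i][j] - segment[i + lag][j]) ** 2
                    for i in range(pairs)
                )
                features.append(total / pairs)
    return features


def local_pse_pssm(pssm: Matrix, n_segments: int, max_lag: int) -> List[float]:
    """Concatenate per-segment features over contiguous row segments."""
    length = len(pssm)
    if n_segments < 1 or length < n_segments:
        raise ValueError("need at least one row per segment")
    features: List[float] = []
    for k in range(n_segments):
        start = k * length // n_segments
        end = (k + 1) * length // n_segments
        features.extend(pse_pssm(pssm[start:end], max_lag))
    return features


# Toy 6 x 20 matrix standing in for a real PSSM (values are made up).
toy_pssm = [[float(i + j) for j in range(20)] for i in range(6)]
toy_features = local_pse_pssm(toy_pssm, n_segments=2, max_lag=1)
print(len(toy_features))  # 2 segments * (20 + 20 * 1) = 80
```

Under these assumptions, each protein maps to a fixed-length vector of `n_segments * 20 * (1 + max_lag)` values regardless of sequence length, which is the kind of input a random forest classifier expects.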