Protein Fold Recognition Based on Sparse Representation Based Classification.

Ke Yan, Yong Xu, Xiaozhao Fang, Chunhou Zheng, Bin Liu
DOI: https://doi.org/10.1016/j.artmed.2017.03.006
IF: 7.011
2017-01-01
Artificial Intelligence in Medicine
Abstract: Knowledge of protein fold type is critical for determining protein structure and function. Because of its importance, several computational methods for fold recognition have been proposed. Most of them are based on well-known machine learning techniques, such as Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs). Although these machine learning methods have stimulated the development of this important area, new techniques are still needed to further improve predictive performance for fold recognition. Sparse Representation based Classification (SRC) has been widely used in image processing, where it shows better performance than other related machine learning methods. In this study, we apply SRC to the protein fold recognition problem. Experimental results on a widely used benchmark dataset show that the proposed method improves on the performance of some basic classifiers and of three state-of-the-art feature selection methods: autocross-covariance (ACC) fold, D-D, and Bi-gram. Finally, we propose a novel computational predictor called MF-SRC for fold recognition, which combines these three features within the SRC framework to achieve a further performance improvement. Compared with other computational methods in this field on the DD, EDD, and TG datasets, the proposed method achieves stable performance by reducing the influence of noise in the dataset. It is anticipated that the proposed predictor may become a useful high-throughput tool for large-scale fold recognition or, at least, play a complementary role to the existing predictors in this regard.
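For readers unfamiliar with SRC, the following minimal Python sketch illustrates the general idea as it is commonly formulated: a test sample is coded as a sparse combination of training samples, and it is assigned to the class whose training samples reconstruct it with the smallest residual. This is not the authors' MF-SRC implementation. The function names (`ista_lasso`, `src_predict`), the ISTA solver used for the l1-regularized coding step, and the regularization weight `lam` are all assumptions made for illustration, and the feature vectors are assumed to be already extracted.

```python
import numpy as np

def ista_lasso(D, y, lam=0.01, n_iter=500):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 using ISTA.

    ISTA is an assumed solver choice; any l1 solver could be substituted.
    """
    step_bound = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        z = x - grad / step_bound
        # Soft-thresholding promotes sparsity in the coefficients.
        x = np.sign(z) * np.maximum(np.abs(z) - lam / step_bound, 0.0)
    return x

def src_predict(train_X, train_labels, test_x, lam=0.01):
    """Classify test_x by class-wise reconstruction residual (SRC sketch).

    train_X: array of shape (n_samples, n_features), one feature vector per row.
    train_labels: class label of each training row.
    """
    train_X = np.asarray(train_X, dtype=float)
    train_labels = np.asarray(train_labels)
    test_x = np.asarray(test_x, dtype=float)

    # Dictionary columns are L2-normalized training samples.
    D = train_X.T / np.linalg.norm(train_X, axis=1)
    y = test_x / np.linalg.norm(test_x)

    x = ista_lasso(D, y, lam)

    classes = np.unique(train_labels)
    residuals = []
    for c in classes:
        mask = train_labels == c
        # Reconstruct y using only the coefficients belonging to class c.
        residuals.append(np.linalg.norm(y - D[:, mask] @ x[mask]))
    return classes[int(np.argmin(residuals))]
```

Calling `src_predict(train_X, train_labels, test_x)` returns the predicted label for a single feature vector. Combining several feature types, as MF-SRC does, would require a fusion step that this sketch does not attempt to reproduce.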