Identification of Hormone-Binding Proteins Using a Novel Ensemble Classifier

Kuo Wang, Sumei Li, Qing Wang, Chunping Hou
DOI: https://doi.org/10.1007/s00607-018-0682-x
2018-01-01
Computing
Abstract: Hormone-binding proteins (HBPs) are important soluble carriers for growth hormones, and correct recognition of HBPs is crucial to understanding their functions. We therefore aimed to construct an efficient and reliable classifier that identifies HBPs accurately. First, 246 proteins were collected from the UniProt database to serve as the benchmark dataset. Each protein sample was encoded as an 8000-dimensional feature vector based on tripeptide composition. We then reduced this high-dimensional feature set with ANOVA, a feature-ranking technique, to obtain an optimal feature subset free of redundant information. Next, three classification methods were applied to the selected tripeptide features, producing three probability sequences. Finally, the three probability sequences were treated as new features and fed to a support vector machine to build the prediction model. The model achieved an accuracy of 90.6% in five-fold cross-validation, which is higher than that of other published methods.
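The 8000-dimensional encoding mentioned in the abstract follows from the 20 standard amino acids: there are 20^3 = 8000 possible tripeptides, and each feature is the relative frequency of one of them in the sequence. A minimal sketch of such an encoding is given below. The function name `tripeptide_composition`, the handling of non-standard residues (windows containing them are skipped), and the normalization by the number of valid windows are assumptions made for illustration, not details taken from the paper.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20^3 = 8000 tripeptides in a fixed order, plus a lookup table.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return an 8000-element list of tripeptide frequencies.

    Assumption: windows containing a non-standard residue are skipped,
    and counts are divided by the number of valid windows.
    """
    seq = sequence.upper()
    counts = [0] * len(TRIPEPTIDES)
    valid_windows = 0
    # Slide a window of length 3 over the sequence, one residue at a time.
    for i in range(len(seq) - 2):
        idx = INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            valid_windows += 1
    if valid_windows == 0:
        # Sequence too short or without any standard tripeptide.
        return [0.0] * len(TRIPEPTIDES)
    return [c / valid_windows for c in counts]


if __name__ == "__main__":
    features = tripeptide_composition("ACDACD")
    print(len(features))            # number of features
    print(features[INDEX["ACD"]])   # frequency of the tripeptide ACD
```

For the toy sequence `ACDACD` the overlapping windows are ACD, CDA, DAC and ACD, so ACD accounts for two of the four windows. The later stages described in the abstract (ANOVA ranking, the three base classifiers, and the final support vector machine) would operate on vectors of this kind and are not sketched here.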