An Efficient LightGBM Model to Predict Protein Self-interacting Using Chebyshev Moments and Bi-gram

Zhao-Hui Zhan, Zhu-Hong You, Yong Zhou, Kai Zheng, Zheng-Wei Li
DOI: https://doi.org/10.1007/978-3-030-26969-2_43
2019-01-01
Abstract: Protein self-interactions (SIPs) play significant roles in most life activities. Although numerous computational methods have been developed to predict SIPs, there is still a need for efficient and accurate techniques that improve SIP prediction performance. In this paper, we propose a machine learning scheme named LGCM for accurate SIP prediction based on protein sequence information. More specifically, a novel feature descriptor combining bi-gram and Chebyshev moments was developed to extract discriminative sequence information. The integrated protein features were then fed into a LightGBM classifier to train the LGCM model. Rigorous cross-validation showed that LGCM outperformed previous SIP prediction methods, achieving accuracies of 96.90% and 98.29% on the yeast and human datasets, respectively. These experimental results support the effectiveness and reliability of LGCM, which may help guide future bioinformatics research.
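To make the bi-gram part of the descriptor concrete, here is a minimal sketch of one way bi-gram features can be computed from a protein sequence. The abstract does not say what the bi-grams are computed over or how they are combined with Chebyshev moments, so counting adjacent residue pairs directly on the amino-acid string is an assumption, and the names `bigram_features`, `BIGRAMS`, and `BIGRAM_INDEX` are hypothetical rather than taken from the paper.

```python
from itertools import product

# Assumption: bi-grams are counted directly on the amino-acid string.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
BIGRAMS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
BIGRAM_INDEX = {bg: i for i, bg in enumerate(BIGRAMS)}


def bigram_features(sequence):
    """Return a 400-dimensional vector of normalized bi-gram frequencies."""
    counts = [0.0] * len(BIGRAMS)
    seq = sequence.upper()
    total = 0
    # Slide over adjacent residue pairs: (s[0], s[1]), (s[1], s[2]), ...
    for a, b in zip(seq, seq[1:]):
        idx = BIGRAM_INDEX.get(a + b)
        if idx is None:
            # Skip pairs containing a non-standard residue (e.g. 'X').
            continue
        counts[idx] += 1
        total += 1
    if total == 0:
        # No valid pair found: return the all-zero vector.
        return counts
    return [c / total for c in counts]


# Hypothetical toy sequence, not taken from the yeast or human datasets.
features = bigram_features("MKVLAAGMKV")
```

A fixed-length vector like this could then be passed, alongside other descriptors, to a gradient-boosting classifier such as LightGBM; that training step is left out here because the abstract gives no details on hyperparameters or data preparation.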