Combining Evolutionary Information and Sparse Bayesian Probability Model to Accurately Predict Self-interacting Proteins

Yan-Bin Wang, Zhu-Hong You, Hai-cheng Yi, Zhan-Heng Chen, Zhen-Hao Guo, Kai Zheng
DOI: https://doi.org/10.1007/978-3-030-26969-2_44
2019-01-01
Abstract: Self-interacting proteins (SIPs) play a crucial role in the investigation of various biochemical developments. In this work, a novel computational method is proposed for accelerating SIP validation using only protein sequences. First, the protein sequence is represented as a Position-Specific Weight Matrix (PSWM) containing protein evolutionary information. Then, we combine the Legendre Moment (LM) and Sparse Principal Component Analysis (SPCA) to extract essential, noise-resistant evolutionary features from the PSWM. Finally, we use a robust Probabilistic Classification Vector Machine (PCVM) classifier to carry out prediction. In cross-validation experiments, the proposed method achieves 95.54% accuracy on the S. cerevisiae dataset, a significant improvement over several competing SIP predictors. The empirical tests reveal that the proposed method can efficiently extract salient features from protein sequences and accurately predict potential SIPs.
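The Legendre Moment step can be illustrated with a minimal sketch. The abstract does not give the exact formula, maximum order, or normalization used, so the discrete approximation below, the pixel-centre mapping to [-1, 1], and the function names `legendre_values` and `legendre_moments` are assumptions for illustration only. The SPCA and PCVM stages are not shown.

```python
def legendre_values(max_order, x):
    """Return [P_0(x), ..., P_max_order(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    values = [1.0, x]
    for n in range(1, max_order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: max_order + 1]


def legendre_moments(matrix, max_order):
    """Approximate Legendre moments lambda_pq (p + q <= max_order) of a
    2-D matrix such as a PSWM, returned as a flat feature list.

    Assumed discretisation (not taken from the paper):
      lambda_pq = (2p+1)(2q+1)/(rows*cols) * sum_ij P_p(x_i) P_q(y_j) f(i, j)
    with row/column indices mapped to cell centres in [-1, 1].
    """
    rows, cols = len(matrix), len(matrix[0])
    px = [legendre_values(max_order, (2 * i + 1) / rows - 1) for i in range(rows)]
    py = [legendre_values(max_order, (2 * j + 1) / cols - 1) for j in range(cols)]

    features = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            total = sum(
                px[i][p] * py[j][q] * matrix[i][j]
                for i in range(rows)
                for j in range(cols)
            )
            features.append((2 * p + 1) * (2 * q + 1) / (rows * cols) * total)
    return features


# Toy stand-in for a PSWM: 4 sequence positions x 20 amino-acid columns.
toy_pswm = [[1.0] * 20 for _ in range(4)]
feats = legendre_moments(toy_pswm, max_order=3)
```

Because the moment count depends only on `max_order`, sequences of different lengths map to feature vectors of the same size, which is what allows a fixed-input classifier to be applied afterwards.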