Prediction of thermophilic protein using 2-D general series correlation pseudo amino acid features

Hao Wan, Yanan Zhang, Shibo Huang
DOI: https://doi.org/10.1016/j.ymeth.2023.08.012
IF: 4.647
2023-10-01
Methods
Abstract: The demand for thermophilic proteins in protein engineering has been increasing recently, and many machine-learning methods for identifying them have emerged. However, most machine-learning-based thermophilic protein identification studies have focused only on accuracy. The relationship between the meaning of the features and the physicochemical properties of the proteins has yet to be studied in depth. In this article, we focus on the relationship between the features and the thermal stability of thermophilic proteins. The method uses 2-D general series correlation pseudo amino acid (SC-PseAAC-General) features and achieves an accuracy of 82.76% with the J48 classifier. In addition, this research found higher frequencies of glutamic acid in thermophilic proteins, which helps them maintain thermal stability by forming hydrogen bonds and salt bridges that prevent denaturation at high temperatures.
biochemistry & molecular biology, biochemical research methods
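
The feature idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the Kyte-Doolittle hydropathy scale is assumed here as a stand-in for whichever physicochemical properties the paper actually selected, and the usual PseAAC weighting and joint normalization are left out. It shows only the two ingredients the abstract mentions, namely amino acid frequencies (for example, glutamic acid, E) and series correlation factors computed between residues a fixed number of positions apart. The J48 classification step is not shown.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. ASSUMPTION: used only as an example
# property; the paper's own property choice is not stated in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def clean(seq):
    """Keep only the 20 standard residues, upper-cased."""
    return [c for c in seq.upper() if c in AMINO_ACIDS]


def composition(seq):
    """Relative frequency of each of the 20 standard amino acids."""
    residues = clean(seq)
    if not residues:
        raise ValueError("no standard residues in sequence")
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[aa] for aa in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (scale[aa] - mean) / sd for aa in AMINO_ACIDS}


def series_correlation(seq, scale, lag):
    """Mean product of property values for residues `lag` positions apart."""
    values = [scale[c] for c in clean(seq)]
    n = len(values)
    if not 1 <= lag < n:
        raise ValueError("lag must be at least 1 and shorter than the sequence")
    return sum(values[i] * values[i + lag] for i in range(n - lag)) / (n - lag)


def feature_vector(seq, max_lag=2):
    """20 composition values followed by max_lag correlation factors."""
    scale = standardize(KYTE_DOOLITTLE)
    comp = composition(seq)
    corr = [series_correlation(seq, scale, k) for k in range(1, max_lag + 1)]
    return [comp[aa] for aa in AMINO_ACIDS] + corr


if __name__ == "__main__":
    toy = "MKEELLKR"  # toy sequence, not taken from the paper's dataset
    print(composition(toy)["E"])      # 0.25 (2 of 8 residues are E)
    print(len(feature_vector(toy)))   # 22 (20 frequencies + 2 lags)
```

Keeping the composition terms alongside the correlation terms is what makes a finding such as the glutamic acid enrichment readable directly from the feature vector, since each of the first 20 entries corresponds to one named residue.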