Abstract: Understanding how a mutation might affect protein stability is of significant importance to protein engineering and for understanding protein evolution and genetic diseases. While a number of computational tools have been developed to predict the effect of missense mutations on protein stability, they are known to exhibit large biases imparted in part by the data used to train and evaluate them. Here, we provide a comprehensive overview of predictive tools, which have provided evolving insight into the importance and relevance of features that can discern the effects of mutations on protein stability. A diverse selection of these freely available tools was benchmarked using a large mutation-level blind dataset of 1342 experimentally characterised mutations across 130 proteins from ThermoMutDB (S1342), a second test dataset encompassing 630 experimentally characterised mutations across 39 proteins from iStable2.0 (S630), and a third blind test dataset consisting of 268 mutations in 27 proteins from the newly published ProThermDB (S268). The performance of the methods was further evaluated with respect to the site of mutation, the type of mutant residue, and across ranges of pH and temperature. Classification performance was also evaluated by labelling the mutations as stabilizing (∆∆G ≥ 0) or destabilizing (∆∆G < 0). The results reveal that the performance of the predictors is affected by the site of mutation and the type of mutant residue. Further, the results show very low performance at pH values of 6-8 and at temperatures higher than 65 °C for all predictors except iStable2.0 on the S630 dataset. To illustrate how stability and structure change upon single point mutation, we considered four mutations, two destabilizing and two stabilizing, from two proteins, namely the toxin protein and bovine liver cytochrome.
Overall, the results on the S268, S630 and S1342 datasets show that the integrated predictors perform better than the mechanistic or individual machine learning predictors. We expect that this paper will provide useful guidance for the design and development of next-generation bioinformatic tools for predicting protein stability changes upon mutations.
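The sign convention used for the classification task can be made concrete with a minimal sketch. It assumes only the thresholds stated in the abstract (∆∆G ≥ 0 is stabilizing, ∆∆G < 0 is destabilizing). The function names and the numeric values are hypothetical illustrations and do not come from the benchmark datasets or from any of the evaluated tools.

```python
from collections import Counter


def classify(ddg: float) -> str:
    """Label a mutation by the sign of its ddG, per the abstract's convention."""
    return "stabilizing" if ddg >= 0 else "destabilizing"


def confusion_counts(experimental, predicted):
    """Count (experimental label, predicted label) pairs."""
    counts = Counter()
    for exp, pred in zip(experimental, predicted):
        counts[(classify(exp), classify(pred))] += 1
    return counts


def accuracy(experimental, predicted):
    """Fraction of mutations whose predicted label matches the experimental one."""
    pairs = list(zip(experimental, predicted))
    if not pairs:
        raise ValueError("at least one mutation is required")
    correct = sum(classify(e) == classify(p) for e, p in pairs)
    return correct / len(pairs)


# Made-up ddG values (kcal/mol) for illustration only.
experimental_ddg = [1.2, -0.5, 0.0, -2.3]
predicted_ddg = [0.8, 0.3, -0.1, -1.9]

counts = confusion_counts(experimental_ddg, predicted_ddg)
acc = accuracy(experimental_ddg, predicted_ddg)
```

Note that a value of exactly zero falls in the stabilizing class under this convention, so predictors that output small negative values for near-neutral mutations are counted as misclassifying them.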

AI challenges for predicting the impact of mutations on protein stability

Assessing the Performance of Computational Predictors for Estimating Protein Stability Changes Upon Missense Mutations

Exploring evolution to uncover insights into protein mutational stability

Assessing computational tools for predicting protein stability changes upon missense mutations using a new dataset

Comparing Supervised Learning and Rigorous Approach for Predicting Protein Stability upon Point Mutations in Difficult Targets

Predicting protein stability changes upon single-point mutation: a thorough comparison of the available tools on a new dataset

Review of predicting protein stability changes upon variations

Quantification of biases in predictions of protein–protein binding affinity changes upon mutations

BayeStab: Predicting Effects of Mutations on Protein Stability with Uncertainty Quantification

Predicting protein thermal stability changes upon single and multi-point mutations via restricted attention subgraph neural network

Protein stability models fail to capture epistatic interactions of double point mutations

Using AlphaFold to predict the impact of single mutations on protein stability and function

Comparison and evaluation of data-driven protein stability prediction models

A three-state prediction of single point mutations on protein stability changes

Robust Prediction of Mutation-Induced Protein Stability Change by Property Encoding of Amino Acids

Predicting Protein Thermostability Upon Mutation Using Molecular Dynamics Timeseries Data

Predicting a Protein's Stability under a Million Mutations

Three Simple Properties Explain Protein Stability Change upon Mutation

Improved prediction of stabilizing mutations in proteins by incorporation of mutational effects on ligand binding

Generative AI impact on protein stability prediction in breast cancer genes