Improved prediction of site‐rates from structure with averaging across homologs

Christoffer Norn, Fábio Oliveira, Ingemar André
DOI: https://doi.org/10.1002/pro.5086
IF: 8
2024-06-27
Protein Science
Abstract: Variation in mutation rates at sites in proteins can largely be understood by the constraint that proteins must fold into stable structures. Models that calculate site‐specific rates based on protein structure and a thermodynamic stability model have shown a significant but modest ability to predict empirical site‐specific rates calculated from sequence. Models that use detailed atomistic models of protein energetics do not outperform simpler approaches using packing density. We demonstrate that a fundamental reason for this is that empirical site‐specific rates are the result of the average effect of many different microenvironments in a phylogeny. By analyzing the results of evolutionary dynamics simulations, we show how averaging site‐specific rates across many extant protein structures can lead to correct recovery of site‐rate prediction. This result is also demonstrated in natural protein sequences and experimental structures. Using predicted structures, we demonstrate that atomistic models can improve upon contact density metrics in predicting site‐specific rates from a structure. The results give fundamental insights into the factors governing the distribution of site‐specific rates in protein families.
biochemistry & molecular biology