Selection originating from protein stability/foldability: Relationships between protein folding free energy, sequence ensemble, and fitness
Sanzo Miyazawa
DOI: https://doi.org/10.1016/j.jtbi.2017.08.018
2024-08-19
Abstract: Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with the factor $\exp(4N_e m(1 - 1/(2N)))$, where $m$ is the Malthusian fitness and $N_e$ and $N$ are the effective and actual population sizes. On the other hand, the maximum-entropy probability distribution of sequences that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with the factor $\exp(-\psi_N)$, where $\psi_N$ is represented as the sum of one-body and pairwise potentials. A protein folding theory indicates that homologous sequences obey a canonical ensemble characterized by $\exp(-\Delta G_{ND}/k_B T_s)$, or by $\exp(- G_{N}/k_B T_s)$ if the amino acid composition is kept constant, where $\Delta G_{ND} \equiv G_N - G_D$, $G_N$ and $G_D$ are the native and denatured free energies, and $T_s$ is the selective temperature. Thus, $4N_e m (1 - 1/(2N))$, $-\Delta \psi_{ND}$, and $-\Delta G_{ND}/k_B T_s$ must be equivalent to each other. From an analysis of the changes ($\Delta \psi_N$) in $\psi_N$ caused by single-nucleotide nonsynonymous substitutions, $T_s$, and from it the glass transition temperature $T_g$ and $\Delta G_{ND}$, are estimated, with reasonable values, for 14 protein domains. In addition, by approximating the probability density function (PDF) of $\Delta \psi_N$ with a log-normal distribution, the PDFs of $\Delta \psi_N$ and of $K_a/K_s$, the ratio of nonsynonymous to synonymous substitution rates per site, are estimated for all mutants and for fixed mutants. It is confirmed that $T_s$ correlates negatively with the mean of $K_a/K_s$. Stabilizing mutations are fixed by positive selection to a significant extent, and they balance destabilizing mutations fixed by random drift. Consistent with the nearly neutral theory, neutral selection is not significant.
Populations and Evolution, Biomolecules
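
The Boltzmann factor $\exp(4N_e m(1 - 1/(2N)))$ quoted in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes Kimura's diffusion approximation for the fixation probability of a single new mutant (initial frequency $1/(2N)$) and a selective advantage $s$ equal to the difference in Malthusian fitness. Under that assumption, the ratio of forward to backward fixation probabilities equals $\exp(4N_e s(1 - 1/(2N)))$, which is the detailed-balance condition a reversible Markov process needs in order to have this Boltzmann distribution as its equilibrium. The function names are hypothetical.

```python
import math


def fixation_probability(s, n_e, n):
    """Fixation probability of a single new mutant with selective advantage s.

    Assumption (not from the abstract): Kimura's diffusion approximation,
    u(s) = (1 - exp(-4*N_e*s*q)) / (1 - exp(-4*N_e*s)), with q = 1/(2N).
    """
    q = 1.0 / (2.0 * n)
    a = 4.0 * n_e * s
    if abs(a) < 1e-12:
        return q  # neutral limit: fixation probability equals initial frequency
    # expm1(x) = exp(x) - 1, so this ratio equals (1 - e^{-aq}) / (1 - e^{-a})
    return math.expm1(-a * q) / math.expm1(-a)


def boltzmann_ratio(s, n_e, n):
    """Ratio of Boltzmann factors exp(4*N_e*m*(1 - 1/(2N))) for a fitness gap s."""
    return math.exp(4.0 * n_e * s * (1.0 - 1.0 / (2.0 * n)))


def detailed_balance_gap(s, n_e, n):
    """Difference between u(s)/u(-s) and the Boltzmann ratio (zero up to rounding)."""
    forward = fixation_probability(s, n_e, n)
    backward = fixation_probability(-s, n_e, n)
    return forward / backward - boltzmann_ratio(s, n_e, n)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    for s in (1e-4, 1e-3, 5e-3):
        print(s, detailed_balance_gap(s, n_e=100, n=200))
```

The check is restricted to moderate values of $4N_e s$; for very large magnitudes `math.expm1` overflows, and a production implementation would need an asymptotic branch.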