Optimality of the genetic code with respect to protein stability and amino acid frequencies

Dimitri Gilis, Serge Massar, Nicolas Cerf, Marianne Rooman
DOI: https://doi.org/10.48550/arXiv.physics/0102044
2001-02-15
Abstract: How robust is the natural genetic code with respect to mistranslation errors? It has long been known that the genetic code is very efficient in limiting the effect of point mutations. A misread codon will commonly code either for the same amino acid or for a similar one in terms of its biochemical properties, so the structure and function of the coded protein remain relatively unaltered. Previous studies have attempted to address this question more quantitatively, namely by statistically estimating the fraction of randomly generated codes that do better than the genetic code regarding its overall robustness. In this paper, we extend these results by investigating the role of amino acid frequencies in the optimality of the genetic code. When measuring the relative fitness of the natural code with respect to a random code, it is indeed natural to assume that a translation error affecting a frequent amino acid is less favorable than one affecting a rare amino acid, at equal mutation cost. We find that taking the amino acid frequency into account accordingly decreases the fraction of random codes that beat the natural code, making the latter comparatively even more robust. This effect is particularly pronounced when more refined measures of the amino acid substitution cost are used than hydrophobicity. To show this, we devise a new cost function by evaluating with computer experiments the change in folding free energy caused by all possible single-site mutations in a set of known protein structures. With this cost function, we estimate that on the order of one random code out of 100 million is more fit than the natural code when taking amino acid frequencies into account. The genetic code therefore seems structured so as to minimize the consequences of translation errors on the 3D structure and stability of proteins.
Biological Physics, Quantitative Biology
What problem does this paper attempt to address?
The core problem this paper addresses is how robust the natural genetic code is against translation errors (i.e., a codon being misread as another codon), and what role amino acid frequencies play in the optimality of the code. Specifically, the authors aim to measure more accurately how the natural genetic code compares with randomly generated codes by taking amino acid frequencies into account.

### Main problem decomposition

1. **Robustness of the genetic code**:
   - The authors seek to quantify the robustness of the natural genetic code against translation errors. They focus on how protein structure and function are affected when a codon is misread as another codon and the encoded amino acid changes as a result.
   - Earlier studies showed that the natural genetic code is very effective at limiting the effects of point mutations, but these studies usually assume that all point mutations occur with the same frequency.

2. **Role of amino acid frequencies**:
   - The paper examines the importance of amino acid frequencies in the optimality of the genetic code. The authors argue that, because different amino acids occur with different frequencies in proteins, these frequencies should be considered when evaluating the robustness of the code.
   - At equal mutation cost, the erroneous replacement of a frequently occurring amino acid is less favorable than the replacement of a rare one. Incorporating amino acid frequencies into the evaluation therefore gives a more accurate picture of how well optimized the genetic code is.

3. **Improved fitness function**:
   - To evaluate the robustness of the genetic code more precisely, the authors propose a new fitness function \( \Phi_{\text{faa}} \), which combines the cost of single-base changes with amino acid frequencies.
   - Comparing the results obtained with different cost functions (the hydrophobicity-based cost function \( g_{\text{hydro}} \) and the protein-stability-based cost function \( g_{\text{mutate}} \)), the authors find that the natural genetic code compares even more favorably with random codes once amino acid frequencies are taken into account.

### Conclusion

By introducing amino acid frequencies through the improved fitness function \( \Phi_{\text{faa}} \), the authors find that the natural genetic code appears even better optimized for limiting the impact of translation errors on the three-dimensional structure and stability of proteins. With the stability-based cost function, only a fraction of about \( 10^{-8} \) of randomly generated genetic codes (roughly one in 100 million) outperforms the natural code. This indicates that the natural genetic code has been optimized to a high degree during evolution, although it may not be strictly optimal. The study also discusses the causal relationship between amino acid frequencies and the genetic code, and proposes that the two may have co-evolved.

In summary, the paper uses a quantitative analysis to show how robust the natural genetic code is against translation errors and how strongly it appears to be optimized.
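The comparison described above, a frequency-weighted fitness evaluated for the natural code and for many random codes, can be sketched in a few lines of Python. This is an illustrative sketch under several assumptions, not the authors' implementation: the exact definition of \( \Phi_{\text{faa}} \) is not given on this page, so each codon is assumed to carry the weight \( p(a)/n(a) \) (frequency of its amino acid divided by the number of its codons); all single-base misreadings are assumed equally likely; misreadings into stop codons are ignored; random codes are assumed to be generated by permuting the 20 amino acids among the natural synonymous codon blocks; and the squared difference in Kyte-Doolittle hydropathy stands in for \( g_{\text{hydro}} \). `PLACEHOLDER_FREQS` is a synthetic distribution, not measured data.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks stop codons.
AA_STRING = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]
NATURAL_CODE = dict(zip(CODONS, AA_STRING))

# Kyte-Doolittle hydropathy values, used here only as a stand-in property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(HYDROPATHY)

# Synthetic placeholder frequencies (weights 1..20, normalized); NOT real data.
PLACEHOLDER_FREQS = {a: (i + 1) / 210 for i, a in enumerate(AMINO_ACIDS)}


def cost(a, b):
    """Assumed substitution cost: squared difference in hydropathy."""
    return (HYDROPATHY[a] - HYDROPATHY[b]) ** 2


def neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def fitness(code, freqs=None):
    """Weighted mean substitution cost of a code; lower means more robust.

    With freqs=None every sense codon gets the same weight; otherwise each
    codon of amino acid a gets weight freqs[a] / (number of codons for a).
    """
    sense = [c for c in CODONS if code[c] != "*"]
    n_codons = {}
    for c in sense:
        n_codons[code[c]] = n_codons.get(code[c], 0) + 1
    total = 0.0
    for c in sense:
        a = code[c]
        weight = 1 / len(sense) if freqs is None else freqs[a] / n_codons[a]
        targets = [code[n] for n in neighbors(c) if code[n] != "*"]
        total += weight * sum(cost(a, b) for b in targets) / len(targets)
    return total


def random_code(rng):
    """Permute amino acids among the natural codon blocks; stops stay fixed."""
    shuffled = AMINO_ACIDS[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(AMINO_ACIDS, shuffled))
    mapping["*"] = "*"
    return {c: mapping[a] for c, a in NATURAL_CODE.items()}


def fraction_better(n_samples, freqs=None, seed=0):
    """Estimate the fraction of random codes with lower cost than the natural code."""
    rng = random.Random(seed)
    natural = fitness(NATURAL_CODE, freqs)
    better = sum(fitness(random_code(rng), freqs) < natural for _ in range(n_samples))
    return better / n_samples
```

Calling `fraction_better(10_000)` and `fraction_better(10_000, PLACEHOLDER_FREQS)` gives the unweighted and frequency-weighted estimates under these assumptions. A fraction near \( 10^{-8} \) cannot be resolved by direct sampling with a few thousand codes, so a sample of this size only illustrates the procedure.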