What geometrically constrained folding models can tell us about real-world protein contact maps

Nora Molkenthin, J. J. Güven, Steffen Mühle, Antonia S. J. S. Mey
DOI: https://doi.org/10.48550/arXiv.2205.09074
2022-05-18
Biological Physics
Abstract: The mechanisms by which a protein's 3D structure can be determined based on its amino acid sequence have long been one of the key mysteries of biophysics. Often simplistic models, such as those derived from geometric constraints, capture bulk real-world 3D protein-protein properties well. One approach is to use protein contact maps to better understand proteins' properties. Here, we investigate the emergent behaviour of contact maps for different geometrically constrained models and real-world protein systems. We derive an analytical approximation for the distribution of model amino acid distances, $s$, by means of a mean-field approach. This approximation is then validated against simulations using a 2D and a 3D random interaction model, as well as against contact maps of real-world protein data. Using data from the RCSB Protein Data Bank (PDB) and the AlphaFold 2 database, the analytical approximation is fitted to protein chain lengths of $L\approx100$, $L\approx200$, and $L\approx300$. While a universal scaling behaviour for protein chains of different lengths could not be deduced, we present evidence that the amino acid distance distributions can be attributed to geometric constraints of protein chains in bulk and that amino acid sequences only play a secondary role.
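To make the notion of a contact map concrete, the following is a minimal Python sketch, not the authors' method. It assumes that $s$ denotes the separation along the chain, $s = |i - j|$, between two residues in contact, which the abstract does not spell out. The 8 Å cutoff, the 3.8 Å step length, the random-walk toy chain, and the function names `contact_map` and `separation_counts` are all illustrative assumptions rather than details from the paper.

```python
import numpy as np


def contact_map(coords, cutoff=8.0):
    """Boolean L x L matrix, True where two residues lie closer than `cutoff`.

    `coords` is an (L, 3) array of residue positions. The 8.0 cutoff is an
    assumed, commonly used value, not one taken from the paper.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    contacts = dist < cutoff
    np.fill_diagonal(contacts, False)  # a residue is not in contact with itself
    return contacts


def separation_counts(contacts):
    """counts[s] = number of contacting pairs (i, j) with j - i = s (assumed meaning of s)."""
    length = contacts.shape[0]
    i, j = np.nonzero(np.triu(contacts, k=1))  # upper triangle: each pair counted once
    return np.bincount(j - i, minlength=length)


# Toy chain: a 3D random walk with fixed step length 3.8 (illustrative only,
# not one of the geometrically constrained models studied in the paper).
rng = np.random.default_rng(0)
steps = rng.normal(size=(100, 3))
steps *= 3.8 / np.linalg.norm(steps, axis=1, keepdims=True)
coords = np.cumsum(steps, axis=0)

cmap = contact_map(coords)
counts = separation_counts(cmap)
```

Dividing `counts` by `counts.sum()` gives an empirical distribution over $s$ for a single chain; for real structures, `coords` would instead hold residue coordinates read from a PDB or AlphaFold file.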