Versatile algorithm for protein bond discovery: NOS linkages as a case study

Sophia Bazzi, Sharareh Sayyad
DOI: https://doi.org/10.26434/chemrxiv-2024-7rkpf
2024-09-17
Abstract: Characterizing chemical bonds and interactions in proteins is essential for understanding these systems and for applications such as drug design and protein engineering. Despite the importance of such investigations, a systematic framework for them has been lacking. Here, we present a machine-learning-based approach for discovering chemical connections in proteins. This method enables the identification of effective descriptors and the prediction of atomic constructs that host specific chemical bonds. To demonstrate the applicability of our approach, we integrate our predictive modeling method with experimental observations for covalent nitrogen-oxygen-sulfur (NOS) linkages between lysine and cysteine. Analyzing over 86,000 protein structures and their X-ray validation reports, we have unveiled sixty-nine NOS linkages beyond the previously known cases, spanning lysine-cysteine, glycine-cysteine, and arginine-cysteine pairs. Our proposed method is easily adaptable to characterize any chemical bond or interaction, opening the way to the discovery of various chemical connections within protein structures.
Chemistry
What problem does this paper attempt to address?
The paper addresses the problem of systematically discovering chemical bonds in protein structures, using nitrogen-oxygen-sulfur (NOS) linkages as a case study. The researchers propose a machine-learning-based approach for identifying specific chemical bonds in proteins. The paper targets three issues:

1. **Lack of a systematic framework**: No systematic method currently exists for discovering chemical bonds in proteins. Because experimental techniques are limited, it is difficult to obtain detailed information on all residues and their interactions in a protein structure.
2. **Limitations of experimental techniques**: Techniques such as X-ray crystallography are powerful, but converting electron density maps into atomic models relies on assumptions about which chemical bonds are known. Unknown bonds can therefore be overlooked or mis-modeled.
3. **Discovery of new chemical bonds**: By analyzing over 86,000 protein structures, the researchers found NOS linkages beyond the known lysine-cysteine type, including glycine-cysteine and arginine-cysteine pairs. These findings suggest that further types of chemical bonds may remain unrecognized.

The proposed algorithm identifies effective descriptors and predicts atomic configurations that host specific chemical bonds. The approach is not limited to NOS linkages: it can be adapted to other chemical bonds and interactions, which supports applications in drug discovery and protein engineering.
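To make the idea of a geometric descriptor concrete, the following is a minimal sketch of a first-pass screen for nitrogen-sulfur pairs that could host an NOS linkage. It is not the authors' algorithm: the `Atom` record, the `candidate_pairs` function, and the 3.0 Å cutoff are hypothetical choices made for illustration, and a real pipeline would parse coordinates from structure files and feed such descriptors into a trained model.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not a real library type)."""
    residue: str                      # three-letter residue name, e.g. "LYS"
    res_id: int                       # residue sequence number
    name: str                         # atom name, e.g. "NZ" or "SG"
    xyz: Tuple[float, float, float]   # Cartesian coordinates in angstroms


def candidate_pairs(atoms: List[Atom], cutoff: float = 3.0):
    """Return (nitrogen, sulfur, distance) triples with distance <= cutoff.

    The cutoff is an illustrative assumption, not a value from the paper.
    Nitrogens are picked by atom name prefix, a simplification that holds
    for the standard amino acids; sulfurs are restricted to cysteine SG.
    """
    nitrogens = [a for a in atoms if a.name.startswith("N")]
    sulfurs = [a for a in atoms if a.residue == "CYS" and a.name == "SG"]
    hits = []
    for n in nitrogens:
        for s in sulfurs:
            if n.res_id == s.res_id:
                continue  # skip a cysteine's own backbone nitrogen
            d = math.dist(n.xyz, s.xyz)
            if d <= cutoff:
                hits.append((n, s, d))
    # Shortest N-S separations first
    return sorted(hits, key=lambda h: h[2])


# Toy coordinates invented for this example, not taken from any structure.
toy_atoms = [
    Atom("LYS", 10, "NZ", (0.0, 0.0, 0.0)),
    Atom("CYS", 20, "SG", (2.7, 0.0, 0.0)),
    Atom("CYS", 20, "N", (6.0, 0.0, 0.0)),
    Atom("GLY", 30, "N", (10.0, 0.0, 0.0)),
]
hits = candidate_pairs(toy_atoms)
for n, s, d in hits:
    print(f"{n.residue}{n.res_id}:{n.name} - {s.residue}{s.res_id}:{s.name}  {d:.2f} A")
```

A distance screen like this only narrows the search. Deciding whether a bridging oxygen is actually present requires the experimental evidence the paper draws on, such as the X-ray validation reports.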