Rapid Quantification of Protein Secondary Structure Composition from a Single Unassigned 1D 13C NMR Spectrum

Haote Li, Marcus Tuttle, Kurt Zilm, Victor Batista
DOI: https://doi.org/10.26434/chemrxiv-2024-qt9g4
2024-06-12
Abstract: The function of a protein is predicated upon its three-dimensional fold. Representing its complex structure as a series of repeating secondary structural elements is one of the most useful ways by which we study, characterize, and visualize a protein. Consequently, experimental methods that quantify the secondary structure content allow us to connect a protein’s structure to its function. Here, we introduce an automated gradient descent-based method, which we refer to as Secondary Structure Distribution by NMR, that allows for rapid quantification of a protein’s secondary structure composition from a single 1D 13C NMR spectrum without chemical shift assignments. The analysis of nearly 900 proteins with known structure and chemical shifts demonstrates the capabilities of our approach. We show that these results rival alternative techniques such as FT-IR and circular dichroism that are commonly used to estimate secondary structure compositions. The resulting method requires only the primary sequence of the protein and its referenced 13C NMR spectrum. Each residue is modeled as an ensemble of secondary structures, with percentage contributions from random coil, α-helix, and β-sheet obtained by minimizing the difference between a simulated and an experimental 1D 13C NMR spectrum. The capabilities of the method are demonstrated on samples at natural abundance or enriched in 13C, with spectra acquired by either solution or solid-state NMR, and even on low magnetic field benchtop NMR spectrometers. This approach allows for rapid characterization of protein secondary structure across states that are traditionally challenging to characterize, including liquid-liquid phase-separated, membrane-bound, and aggregated states.
Chemistry
What problem does this paper attempt to address?
The paper addresses the rapid quantification of protein secondary structure composition and proposes a new method based on Nuclear Magnetic Resonance (NMR). Secondary structure content is commonly estimated with techniques such as Circular Dichroism (CD) and Fourier Transform Infrared Spectroscopy (FT-IR), or derived from time-consuming NMR chemical shift assignments. Proteins in liquid-liquid phase-separated, membrane-bound, or aggregated states are traditionally challenging to characterize. The method introduced in this article, called "Secondary Structure Distribution by NMR (SSD-NMR)," automatically infers the secondary structure composition of a protein from a single unassigned 1D 13C NMR spectrum, using only the primary sequence and the referenced spectrum, with no chemical shift assignments required. The approach models each residue as a mixture of random coil, α-helix, and β-sheet, and uses gradient descent to find the percentage contributions that minimize the difference between the simulated and experimental 1D 13C NMR spectra. The method applies to samples at natural abundance or enriched in 13C, to spectra acquired by solution or solid-state NMR, and even to data collected on low-field benchtop NMR spectrometers, enabling rapid characterization of secondary structure in states that are otherwise difficult to study. The paper validates the method on nearly 900 proteins with known structures and chemical shifts and reports that its results rival those of commonly used techniques such as CD and FT-IR.
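The fitting idea described above can be illustrated with a minimal sketch, under loud assumptions. It is not the authors' SSD-NMR implementation: the function names (`softmax`, `state_lines`, `simulate`, `fit_fractions`), the Gaussian line shape, the line width, the learning rate, and the peak positions are all hypothetical choices made for illustration, and the shifts are made-up numbers rather than real reference values. Each residue carries three populations (coil, helix, sheet) kept on the simplex by a softmax, the simulated spectrum is the population-weighted sum of per-state lines, and plain gradient descent reduces the integrated squared difference to the target spectrum.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: maps unconstrained logits to fractions that sum to 1."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def state_lines(shifts, ppm, width):
    """Unit-height Gaussian line for every (residue, state) pair.

    shifts: (n_res, 3) peak centers in ppm, columns = coil, helix, sheet.
    Returns an array of shape (n_res, 3, n_points).
    """
    return np.exp(-0.5 * ((ppm[None, None, :] - shifts[:, :, None]) / width) ** 2)

def simulate(weights, lines):
    """Simulated 1D spectrum: population-weighted sum of all lines."""
    return np.einsum("rs,rsp->p", weights, lines)

def fit_fractions(spectrum, shifts, ppm, width=0.3, lr=1.0, steps=3000):
    """Gradient descent on the integrated squared residual (assumed loss)."""
    lines = state_lines(shifts, ppm, width)
    dppm = ppm[1] - ppm[0]
    logits = np.zeros(shifts.shape)  # start from equal populations
    for _ in range(steps):
        w = softmax(logits)
        resid = simulate(w, lines) - spectrum
        # Gradient of dppm * sum(resid**2) with respect to each population
        grad_w = 2.0 * dppm * np.einsum("p,rsp->rs", resid, lines)
        # Chain rule through the softmax
        grad_logits = w * (grad_w - (w * grad_w).sum(axis=1, keepdims=True))
        logits -= lr * grad_logits
    w = softmax(logits)
    return w, w.mean(axis=0)  # per-residue fractions, overall composition

# Synthetic, noise-free check with made-up shifts (NOT real reference values).
ppm = np.linspace(40.0, 70.0, 1501)
coil = np.array([46.0, 55.0, 64.0])
shifts = np.stack([coil, coil + 3.0, coil - 2.0], axis=1)  # coil, helix, sheet
true_w = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5]])
target = simulate(true_w, state_lines(shifts, ppm, 0.3))
per_residue, composition = fit_fractions(target, shifts, ppm)
print(np.round(composition, 3))  # overall coil / helix / sheet fractions
```

In this idealized case the lines are well separated, so the per-residue fractions are individually recoverable. In a real unassigned spectrum many lines overlap, and the overall composition is likely to be far better determined than any single residue's populations.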