Abstract:

Motivation: Backbone resonance assignment is a critical bottleneck in studies of protein structure, dynamics and interactions by nuclear magnetic resonance (NMR) spectroscopy. A minimalist approach to assignment, which we call 'contact-based', seeks to dramatically reduce experimental time and expense by replacing the standard suite of through-bond experiments with the through-space (nuclear Overhauser enhancement spectroscopy, NOESY) experiment. In the contact-based approach, spectral data are represented in a graph with vertices for putative residues (of unknown relation to the primary sequence) and edges for hypothesized NOESY interactions, such that observed spectral peaks could be explained if the residues were 'close enough'. Due to experimental ambiguity, several incorrect edges can be hypothesized for each spectral peak. An assignment is derived by identifying consistent patterns of edges (e.g. for alpha-helices and beta-sheets) within a graph and by mapping the vertices to the primary sequence. The key algorithmic challenge is to uncover these patterns even when they are obscured by significant noise.

Results: This paper develops, analyzes and applies a novel algorithm for the identification of polytopes representing consistent patterns of edges in a corrupted NOESY graph. Our randomized algorithm aggregates simplices into polytopes and fixes inconsistencies with simple local modifications, called rotations, that maintain most of the structure already uncovered. In characterizing the effects of experimental noise, we employ an NMR-specific random graph model in proving that our algorithm gives optimal performance in expected polynomial time, even when the input graph is significantly corrupted. We confirm this analysis in simulation studies with graphs corrupted by up to 500% noise. Finally, we demonstrate the practical application of the algorithm on several experimental beta-sheet datasets. Our approach is able to eliminate a large majority of noise edges and to uncover large consistent sets of interactions.

Availability: Our algorithm has been implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.
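The graph representation described under Motivation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_noesy_graph`, the use of a single proton chemical shift per putative residue, the undirected edges, and the matching tolerance `tol` are all assumptions made for illustration. The sketch shows only how one spectral peak can give rise to several hypothesized edges when chemical shifts nearly coincide.

```python
from collections import defaultdict


def build_noesy_graph(shifts, peaks, tol=0.02):
    """Hypothesize NOESY edges between putative residues.

    shifts: dict mapping each vertex (putative residue) to a proton
            chemical shift in ppm (illustrative simplification).
    peaks:  list of (source_vertex, observed_shift) pairs, one per peak.
    tol:    hypothetical matching tolerance in ppm.

    Returns a dict mapping each vertex to the set of vertices that
    could explain one of its peaks if the two were 'close enough'.
    """
    graph = defaultdict(set)
    for src, observed in peaks:
        for cand, shift in shifts.items():
            # Every residue whose shift matches the peak within the
            # tolerance is a candidate, so one peak can yield several
            # edges, most of them incorrect.
            if cand != src and abs(shift - observed) <= tol:
                graph[src].add(cand)
                graph[cand].add(src)
    return dict(graph)


# Toy data: residues "a" and "b" have nearly identical shifts.
shifts = {"a": 8.10, "b": 8.11, "c": 7.50, "d": 9.02}
peaks = [("c", 8.10), ("d", 7.51)]
graph = build_noesy_graph(shifts, peaks)

for vertex in sorted(graph):
    print(vertex, sorted(graph[vertex]))
```

In this toy input the peak observed from `c` at 8.10 ppm matches both `a` and `b`, so two edges are hypothesized although at most one reflects the true interaction. Separating such noise edges from consistent patterns is the problem the abstract addresses.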

Towards Fully Automated Structure-Based NMR Resonance Assignment of ¹⁵N-Labeled Proteins from Automatically Picked Peaks.

Towards automated structure-based NMR resonance assignment

Integer Programming Model for Automated Structure-based NMR Assignment

Error tolerant NMR backbone resonance assignment and automated structure generation.

An Automated Framework for NMR Resonance Assignment Through Simultaneous Slice Picking and Spin System Forming.

Towards Automating Protein Structure Determination from NMR Data

Combining Automated Peak Tracking in SAR by NMR with Structure-Based Backbone Assignment from 15N-NOESY

Automated Assignment of Backbone Resonances Using Residual Dipolar Couplings Acquired from a Protein with Known Structure

Automated probabilistic method for assigning backbone resonances of (13C,15N)-labeled proteins

Combining ambiguous chemical shift mapping with structure-based backbone and NOE assignment from 15N-NOESY

Towards Automatic Protein Backbone Assignment Using Proton-Detected 4D Solid-State NMR Data

Towards fully automated protein structure elucidation with NMR spectroscopy

Exploiting image registration for automated resonance assignment in NMR

Automated Assignment in Selectively Methyl-Labeled Proteins.

Protein NMR assignment by isotope pattern recognition

A new method on reconstructing protein structure from NOESY distances

ASAP: An automatic sequential assignment program for congested multidimensional solid state NMR spectra

NMR Backbone Assignment of Large Proteins by Using ¹³Cα-Only Triple-Resonance Experiments.

Automated Protein-Structure Elucidation from 2D NMR

An efficient randomized algorithm for contact-based NMR backbone resonance assignment

Protein Structure Determination Using a Riemannian Approach