Integer Programming Model for Automated Structure-based NMR Assignment

Richard Jang, Xin Gao, David R. Cheriton
2009-01-01
Abstract: We introduce the “Automated Structure-based Assignment” problem: given a reference 3D structure, a protein sequence, and its NMR spectra, automatically interpret the NMR spectra and perform backbone resonance assignment. We then propose a solution. Its core is a novel integer linear programming model, which serves as a general framework for many versions of the structure-based assignment problem. As a proof of concept, our system generated an automatic assignment for a real protein, TM1112, with 91% recall and 99% precision, starting from scratch. When we restrict ourselves to the special case where perfect peak lists are given, we can compare our results with existing results in the field. In particular, we reduced the assignment error of Xiong-Pandurangan-Bailey-Kellogg’s method fivefold on average, with a speedup of more than a thousandfold. Our system also achieves 91% assignment accuracy on real experimental data for Ubiquitin. These results have direct practical implications. For example, in the protein design process, a protein is modified slightly and its structure is again measured by NMR experiments. Our method automates this process, saving time on tedious peak picking and resonance assignment. As another example, when a homologous protein with known structure is available, our method increases the assignment accuracy and hence enables automated NMR structure determination.

The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.
All correspondence should be addressed to mli@uwaterloo.ca.