Automated probabilistic method for assigning backbone resonances of (13C,15N)-labeled proteins
J A Lukin, A P Gove, S N Talukdar, C Ho
DOI: https://doi.org/10.1023/a:1018602220061
Abstract: We present a computer algorithm for the automated assignment of polypeptide backbone and 13Cβ resonances of a protein of known primary sequence. Input to the algorithm consists of cross peaks from several 3D NMR experiments: HNCA, HN(CA)CO, HN(CA)HA, HNCACB, COCAH, HCA(CO)N, HNCO, HN(CO)CA, HN(COCA)HA, and CBCA(CO)NH. Data from these experiments performed on glutamine-binding protein are analyzed statistically using Bayes' theorem to yield objective probability scoring functions for matching chemical shifts. Such scoring is used in the first stage of the algorithm to combine cross peaks from the first five experiments into intraresidue segments of chemical shifts (N(i), HN(i), Cα(i), Cβ(i), C'(i)), while cross peaks from the latter five are combined into interresidue segments (Cα(i), Cβ(i), C'(i), N(i+1), HN(i+1)). Given a tentative assignment of segments, the second stage of the procedure calculates probability scores based on the likelihood of matching the chemical shifts of each segment with (i) overlapping segments and (ii) chemical shift distributions of the underlying amino acid type (and secondary structure, if known). This joint probability is maximized by rearranging segments using a simulated annealing program, optimized for efficiency. The automated assignment program was tested using CBCANH and CBCA(CO)NH cross peaks of two previously assigned proteins, calmodulin and CheA. The agreement between the results of our method and the published assignments was excellent. Our algorithm was also applied to the observed cross peaks of glutamine-binding protein of Escherichia coli, yielding an assignment in excellent agreement with that obtained by time-consuming manual methods. The chemical shift assignment procedure described here should be most useful for NMR studies of large proteins, which are now feasible with the use of pulsed-field gradients and random partial deuteration of samples.
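The second stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the Gaussian scoring form, the matching tolerances in MATCH_SD, the penalty cap FLOOR, the per-residue-type statistics in TYPE_STATS, the cooling schedule, and all function names are assumptions made for illustration. Only Cα and Cβ shifts are used, as in the CBCANH/CBCA(CO)NH test case, and the shift values in the usage example are synthetic.

```python
import math
import random

# Illustrative (mean, sd) of CA and CB shifts in ppm per residue type.
# Rough values chosen for this sketch; they are NOT the paper's statistics.
TYPE_STATS = {
    "A": {"CA": (53.1, 2.0), "CB": (19.0, 1.8)},
    "S": {"CA": (58.7, 2.1), "CB": (63.8, 1.5)},
    "T": {"CA": (62.2, 2.6), "CB": (69.7, 1.7)},
    "L": {"CA": (55.6, 2.1), "CB": (42.3, 1.9)},
    "V": {"CA": (62.5, 2.9), "CB": (32.7, 1.8)},
}
MATCH_SD = {"CA": 0.2, "CB": 0.3}  # assumed matching tolerances (ppm)
FLOOR = -10.0                      # assumed cap on the penalty for a clear mismatch


def log_gauss(x, mean, sd):
    """Unnormalized Gaussian log-score, capped below at FLOOR."""
    z = (x - mean) / sd
    return max(-0.5 * z * z, FLOOR)


def position_score(seg, residue):
    """Term (ii): fit of a segment's own shifts to the residue type at its position."""
    stats = TYPE_STATS[residue]
    return sum(log_gauss(seg["own"][k], *stats[k]) for k in stats)


def link_score(seg, nxt):
    """Term (i): agreement between a segment's own shifts and the
    preceding-residue shifts carried by the segment placed after it."""
    return sum(log_gauss(nxt["prev"][k], seg["own"][k], MATCH_SD[k]) for k in MATCH_SD)


def total_score(order, segments, sequence):
    """Joint log-score of placing segments[order[p]] at sequence position p."""
    score = sum(position_score(segments[s], sequence[p]) for p, s in enumerate(order))
    score += sum(link_score(segments[a], segments[b]) for a, b in zip(order, order[1:]))
    return score


def anneal(segments, sequence, steps=5000, t_start=5.0, t_end=0.05, seed=0):
    """Maximize total_score by swapping segment positions (Metropolis rule)."""
    rng = random.Random(seed)
    order = list(range(len(segments)))
    rng.shuffle(order)
    score = total_score(order, segments, sequence)
    best, best_score = order[:], score
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new = total_score(order, segments, sequence)
        if new >= score or rng.random() < math.exp((new - score) / temp):
            score = new
            if score > best_score:
                best, best_score = order[:], score
        else:
            order[i], order[j] = order[j], order[i]  # undo the rejected swap
    return best, best_score


# Usage with synthetic (CA, CB) shifts; the first entry is an unobserved
# residue preceding the six-residue stretch "ASTLVA".
shifts = [(55.4, 32.9), (52.5, 18.6), (58.3, 64.1), (61.9, 69.5),
          (55.0, 42.0), (62.8, 32.5), (53.6, 19.4)]
sequence = "ASTLVA"
segments = [
    {"own": {"CA": shifts[i][0], "CB": shifts[i][1]},
     "prev": {"CA": shifts[i - 1][0], "CB": shifts[i - 1][1]}}
    for i in range(1, len(shifts))
]
best_order, best_score = anneal(segments, sequence)
print(best_order, round(best_score, 3))
```

Capping each term at FLOOR keeps a single gross mismatch from dominating the joint score, which lets the temperature schedule explore alternative arrangements. Recomputing the full score after every swap is the simplest choice; an efficiency-oriented implementation would update only the terms affected by the two swapped positions.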