Integrating De Novo Sequencing and Database Search for Monoclonal Antibody Sequencing

Xiaoguang Lei, Baozhen Shan
2013-01-01
Journal of Biomolecular Techniques
Abstract: A mass spectrometry-based workflow for sequencing monoclonal antibodies is presented. It combines multi-enzyme digestion, high-resolution acquisition of both precursor and fragment ions, and integration of database search with de novo sequencing. All MS/MS spectra were first searched against a database, and the top-ranking proteins were selected as protein candidates. Unidentified spectra were then subjected to de novo sequencing. High-quality de novo sequences were selected for homology matching against the candidate proteins. Sequence segments of the protein candidates were replaced with the matching de novo sequences to generate new candidates, and this was repeated until no better protein candidates could be found. A sample of human monoclonal antibody was reduced, alkylated, and digested with six different enzymes (AspN, chymotrypsin, GluC, LysC, pepsin and trypsin). The peptide mixtures were analyzed by nano-LC-MS on an LTQ-Orbitrap, with high-resolution detection in the Orbitrap. The six raw data files (MS/MS spectra) were analyzed with PEAKS. All MS/MS spectra were searched against a public antibody database. The top-ranking proteins (IGHG1, HV102, IGKC, KV304, etc.) were selected as candidates for further analysis. At a false discovery rate (FDR) of 0.5%, most of the constant region sequences of the heavy chain (Ig gamma) and light chain (Ig kappa) could be found in the current database, with coverage of 99% and 100% respectively. However, the variable domains, which contain the complementarity determining regions (CDRs), are incomplete in the database, with coverage of 40% for the heavy chain and 36% for the light chain. By iteratively finding homology matches, the full sequences of the antibody chains, including both constant and variable regions, were determined at 0.5% FDR. For example, the sequence of the variable region of the heavy chain was determined by the homology match of the de novo sequence KFKSKATLTVDKSASTAYME to the database sequence KFYGRVTLTRDTSASTAYME.
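The iterative replace-by-homology step described in the abstract can be illustrated with a minimal Python sketch. It is not the PEAKS implementation: the ungapped identity score, the `min_identity` threshold, the `max_rounds` cap, and the function names are all assumptions chosen for illustration, and the flanking residues in the usage example are placeholders rather than real antibody sequence. The actual workflow judges candidates against the spectra; here percent identity stands in for that score.

```python
def best_ungapped_match(peptide, protein):
    """Return (start, identity) for the best ungapped placement of peptide in protein."""
    best_start, best_identity = -1, 0.0
    n = len(peptide)
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        matches = sum(a == b for a, b in zip(peptide, window))
        identity = matches / n
        if identity > best_identity:
            best_start, best_identity = start, identity
    return best_start, best_identity


def refine_candidate(candidate, de_novo_peptides, min_identity=0.6, max_rounds=10):
    """Replace homologous segments of a candidate with de novo peptides.

    Repeats until a full pass changes nothing (or max_rounds is reached,
    a safeguard in case overlapping peptides disagree with each other).
    """
    for _ in range(max_rounds):
        changed = False
        for peptide in de_novo_peptides:
            start, identity = best_ungapped_match(peptide, candidate)
            # Homologous but not identical: substitute the de novo segment.
            if start >= 0 and min_identity <= identity < 1.0:
                end = start + len(peptide)
                candidate = candidate[:start] + peptide + candidate[end:]
                changed = True
        if not changed:
            break
    return candidate


# Usage with the pair quoted in the abstract; "AAAA" and "GGGG" are
# placeholder flanks, not real sequence.
database_segment = "KFYGRVTLTRDTSASTAYME"
de_novo_segment = "KFKSKATLTVDKSASTAYME"
candidate = "AAAA" + database_segment + "GGGG"
refined = refine_candidate(candidate, [de_novo_segment])
print(refined)
```

The two 20-residue segments from the abstract agree at 14 positions, so the 0.6 threshold assumed here accepts the match; a real pipeline would also need to handle gaps and isobaric residues such as I/L, which this sketch ignores.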
This workflow offers an alternative approach for sequencing monoclonal antibodies.