InProID: an Integrated Protein Identification System

Irina Fedulova, Sergey Pevtsov, Mingwu Zhang, Zheng Ouyang, Sunil Prabhakar, Xiang Zhang
2005-01-01
Abstract: Protein identification is a major task in proteomics. Two methods are currently used to identify proteins from MS/MS data: database searching and de novo sequencing. Database searching correlates the spectra with in-silico protein sequences. The disadvantages of a database searching strategy are well understood (e.g., prediction errors in gene-finding programs). In de novo sequencing, the peptide sequence is derived directly from the MS/MS spectrum (1, 2). The advantage of de novo sequencing is that it does not rely on a protein database. It does, however, require high mass accuracy, and it cannot always assign peptides correctly because some amino acids have equal or nearly equal masses. This study integrates the two approaches to overcome these limitations. InProID includes three sections: data pre-processing, de novo sequencing, and database searching. Data pre-processing removes …
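The amino acid mass equivalence mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from InProID: the function name `ambiguous_pairs` and the tolerance values are hypothetical, and the table holds standard monoisotopic residue masses rounded to five decimals. The sketch lists residue pairs that a mass-based sequencer cannot tell apart at a given mass tolerance.

```python
from itertools import combinations

# Standard monoisotopic residue masses in daltons (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def ambiguous_pairs(tolerance_da):
    """Return residue pairs whose masses differ by at most tolerance_da.

    Hypothetical helper for illustration only; it compares single
    residues and ignores multi-residue combinations.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(RESIDUE_MASS), 2)
        if abs(RESIDUE_MASS[a] - RESIDUE_MASS[b]) <= tolerance_da
    ]


print(ambiguous_pairs(0.01))  # tight tolerance (assumed value)
print(ambiguous_pairs(0.05))  # looser tolerance (assumed value)
```

Under these assumptions, leucine and isoleucine remain indistinguishable at any tolerance because their masses are identical, while lysine and glutamine become ambiguous only when the tolerance exceeds their difference of roughly 0.036 Da. This is one way to see why de novo sequencing depends on high mass accuracy.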