A Review of Protein Inference

Julian Uszkoreit, Katrin Marcus, Martin Eisenacher
DOI: https://doi.org/10.1007/978-1-0716-4152-1_4
Abstract: Protein inference is an often neglected though crucial step in most proteomic experiments. In the bottom-up proteomic approach, the actual molecules of interest, the proteins, are digested into peptides before measurement on a mass spectrometer. This approach introduces a loss of information: the proteins must be inferred from the identified peptides. While this might seem trivial, it poses several problems, one of the biggest being the presence of peptides that are shared among proteins. Depending on the database used for identification, these amino acid sequences can belong to more than one protein. If such peptides are identified in a sample, it cannot be said which proteins were actually in the sample; only an estimate of the most probable proteins or protein groups can be given, based on a predefined inference strategy. Here we describe the effect of the database chosen for peptide identification on the number of shared peptides. Afterward, the most commonly used protein inference methods are sketched, and the necessity of stringent false discovery rate control at the peptide and protein levels is discussed. Finally, we explain how the tool PIA ("Protein Inference Algorithms") can be used together with the workflow environment KNIME and OpenMS to perform protein inference in a common proteomic experiment.
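The shared-peptide problem described in the abstract can be illustrated with a minimal sketch. The peptide sequences, protein accessions, and function names below are hypothetical toy values, not taken from the chapter, and the grouping rule (proteins with identical peptide evidence are reported as one group) is only one simple inference strategy. It is not meant to represent how PIA works.

```python
from collections import defaultdict

# Hypothetical toy data: each identified peptide maps to the database
# proteins whose sequence contains it.
peptide_to_proteins = {
    "PEPTIDEA": {"P1"},
    "PEPTIDEB": {"P1", "P2", "P3"},
    "PEPTIDEC": {"P2", "P3"},
    "PEPTIDED": {"P2", "P3"},
}


def shared_peptides(pep_map):
    """Return the peptides that map to more than one protein."""
    return sorted(pep for pep, prots in pep_map.items() if len(prots) > 1)


def group_indistinguishable(pep_map):
    """Group proteins that are supported by exactly the same peptides."""
    # Invert the mapping: protein -> set of peptides that support it.
    protein_to_peptides = defaultdict(set)
    for pep, prots in pep_map.items():
        for prot in prots:
            protein_to_peptides[prot].add(pep)

    # Proteins with identical peptide sets cannot be told apart.
    groups = defaultdict(list)
    for prot, peps in protein_to_peptides.items():
        groups[frozenset(peps)].append(prot)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    print(shared_peptides(peptide_to_proteins))
    print(group_indistinguishable(peptide_to_proteins))
```

In this toy case, P2 and P3 are supported by the same three peptides, so the evidence cannot separate them and they are reported together as one protein group, while P1 is distinguished by its unique peptide.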