Proteomics Standards Initiative's ProForma 2.0: Unifying the encoding of Proteoforms and Peptidoforms

Richard D. LeDuc, Eric W. Deutsch, Pierre-Alain Binz, Ryan T. Fellers, Anthony J. Cesnik, Joshua A. Klein, Tim Van Den Bossche, Ralf Gabriels, Arshika Yalavarthi, Yasset Perez-Riverol, Jeremy Carver, Wout Bittremieux, Shin Kawano, Benjamin Pullman, Nuno Bandeira, Neil L. Kelleher, Paul M. Thomas, Juan Antonio Vizcaíno
DOI: https://doi.org/10.48550/arXiv.2109.11352
2022-03-22
Abstract: There is a need to represent, in a standard manner, all the possible variations of a protein or peptide primary sequence, including both artefactual and post-translational modifications of peptides and proteins. With that overall aim, the Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has developed a notation called ProForma 2.0, a substantial extension of the original ProForma notation developed by the Consortium for Top-Down Proteomics (CTDP). ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. The notation therefore supports the use cases needed for bottom-up and middle/top-down proteomics approaches, and it allows highly modified proteins and peptides to be encoded as a string that is both human- and machine-readable. ProForma 2.0 covers the encoding of protein modification names and accessions, cross-linking reagents (including disulfides), glycans, modifications encoded as mass shifts and/or chemical formulas, labile modifications, C- and N-terminal modifications, ambiguity in the modification position, and the representation of atomic isotopes, among other use cases. Notational conventions are based on public controlled vocabularies and ontologies. Detailed information about the notation and its existing implementations is available at http://www.psidev.info/proforma and in the corresponding GitHub repository (https://github.com/HUPO-PSI/proforma).
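To make the idea of a "human- and machine-readable string" concrete, the sketch below reads a deliberately small subset of the notation: single-letter residues, each optionally followed by one or more square-bracketed tags (a modification name or a mass shift). The example strings follow the bracketed style used in the ProForma documentation, but the function names `parse_simple_proforma` and `bare_sequence` are hypothetical and are not part of any existing implementation. Terminal modifications, labile modifications, cross-links, glycans, ambiguity groups and isotope declarations are not handled; the specification remains the authority on the full grammar.

```python
import re

# ASSUMPTION: only the simplest subset of the notation is covered here:
# one uppercase residue letter followed by zero or more [bracketed] tags.
TOKEN = re.compile(r"([A-Z])((?:\[[^\[\]]*\])*)")
TAG = re.compile(r"\[([^\[\]]*)\]")


def parse_simple_proforma(text):
    """Return a list of (residue, [tag, ...]) pairs.

    Raises ValueError on anything outside the simple subset
    (for example terminal modifications or nested brackets).
    """
    result = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(
                f"unsupported syntax at position {pos}: {text[pos:]!r}"
            )
        residue, tags = match.groups()
        # Extract the text inside each bracket pair attached to this residue.
        result.append((residue, TAG.findall(tags)))
        pos = match.end()
    return result


def bare_sequence(parsed):
    """Recover the unmodified amino acid sequence from the parsed pairs."""
    return "".join(residue for residue, _ in parsed)


if __name__ == "__main__":
    # The same peptidoform written with modification names and with mass shifts.
    for peptidoform in ("EM[Oxidation]EVEES[Phospho]PEK",
                        "EM[+15.9949]EVEES[+79.9663]PEK"):
        parsed = parse_simple_proforma(peptidoform)
        print(bare_sequence(parsed), [p for p in parsed if p[1]])
```

Raising an error on unsupported syntax, rather than skipping it, is a deliberate choice in this sketch: a partial parser that silently drops a terminal modification or a cross-link would misreport the proteoform.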
Biomolecules