Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms

Richard D LeDuc, Eric W Deutsch, Pierre-Alain Binz, Ryan T Fellers, Anthony J Cesnik, Joshua A Klein, Tim Van Den Bossche, Ralf Gabriels, Arshika Yalavarthi, Yasset Perez-Riverol, Jeremy Carver, Wout Bittremieux, Shin Kawano, Benjamin Pullman, Nuno Bandeira, Neil L Kelleher, Paul M Thomas, Juan Antonio Vizcaíno
DOI: https://doi.org/10.1021/acs.jproteome.1c00771
2022-04-01
Abstract: It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.
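To make the idea of a human- and machine-readable string concrete, the sketch below reads a deliberately small subset of the notation: one-letter residues, each optionally followed by a single bracketed tag holding either a modification name or a signed mass shift. The function names (`parse_simple_proforma`, `strip_modifications`) and the regular expression are illustrative assumptions, not part of the specification or of any published implementation, and features such as ambiguous localization, cross-links, chemical formulas, and isotopes are not handled.

```python
import re

# One uppercase residue letter, optionally followed by one [tag].
# This covers only a small, assumed subset of the notation.
TOKEN = re.compile(r"([A-Z])(?:\[([^\[\]]+)\])?")


def parse_simple_proforma(s):
    """Split a simplified ProForma-style string into (residue, tag) pairs.

    The tag is None for unmodified residues. Anything outside the
    supported subset raises ValueError instead of being guessed at.
    """
    pairs = []
    pos = 0
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at position {pos}: {s[pos:]!r}")
        pairs.append((m.group(1), m.group(2)))
        pos = m.end()
    return pairs


def strip_modifications(s):
    """Return the bare amino acid sequence with all tags removed."""
    return "".join(residue for residue, _ in parse_simple_proforma(s))


if __name__ == "__main__":
    # The same peptidoform written with modification names and with mass shifts.
    for text in ("EM[Oxidation]EVEES[Phospho]PEK", "EM[+15.9949]EVEES[+79.9663]PEK"):
        modified = [(r, t) for r, t in parse_simple_proforma(text) if t is not None]
        print(strip_modifications(text), modified)
```

Both example strings reduce to the same bare sequence, which illustrates how a name-based tag and a mass-shift tag can annotate the same residue positions. For real data, the full specification at http://psidev.info/proforma should be treated as the authority.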