Enrichment-Based Proteogenomics Identifies Microproteins, Missing Proteins, and Novel smORFs in Saccharomyces cerevisiae

Cuitong He, Chenxi Jia, Yao Zhang, Ping Xu
DOI: https://doi.org/10.1021/acs.jproteome.8b00032
2018-01-01
Journal of Proteome Research
Abstract: Microproteins are peptides composed of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). It has been demonstrated that microproteins participate in and regulate a wide range of functions in cells. However, the annotation and identification of microproteins are challenging, in part owing to their low molecular weight, low abundance, and hydrophobicity. These factors have left smORFs unannotated in genome processing and have made their identification at the protein level difficult. Large-scale enrichment of microproteins in proteogenomics has made it possible to efficiently identify microproteins and discover unannotated smORFs in Saccharomyces cerevisiae. We integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. In total, 31 proteins were confirmed as MPs by spectrum quality checking. Three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, peptide synthesis, homologue matching, and so on. This study not only demonstrates that there are potential smORF candidates to be annotated in an extensively studied organism but also presents an efficient strategy for the discovery of small MPs. All MS data sets have been deposited in ProteomeXchange with the identifier PXD008586.