Mass-Spectrometry-Based Near-Complete Draft of the Saccharomyces cerevisiae Proteome

Yuan Gao, Lingyan Ping, Duc Duong, Chengpu Zhang, Eric B. Dammer, Yanchang Li, Peiru Chen, Lei Chang, Huiying Gao, Junzhu Wu, Ping Xu
DOI: https://doi.org/10.1101/2020.06.24.168526
2021-01-01
Journal of Proteome Research
Abstract: Proteomics approaches designed to catalogue all open reading frames (ORFs) of an organism under a defined set of growth conditions have flourished in recent years. However, no proteome has been sequenced completely so far. Here, we generate the largest yeast proteome data set, including 5610 identified proteins, using a strategy based on optimized sample preparation and high-resolution mass spectrometry. Among the 5610 identified proteins, 94.1% are core proteins, which achieves near-complete coverage of the yeast ORFs. Comprehensive analysis of the missing proteins shows that they are missed mainly because of their physical properties. A review of protein abundance shows that our proteome encompasses a uniquely broad dynamic range. Additionally, these abundance values correlate highly with mRNA abundance, implying a high level of accuracy, sensitivity, and precision. We present examples of how the data could be used, including reannotating gene localization and providing expression evidence for pseudogenes. Our near-complete yeast proteome data set will be a useful and important resource for further systematic studies.