[Microbial metaproteomics--From sample processing to data acquisition and analysis]

En-Hui Wu,Liang Qiao
DOI: https://doi.org/10.3724/SP.J.1123.2024.02009
Abstract:Microorganisms are closely associated with human diseases and health. Understanding the composition and function of microbial communities requires extensive research. Metaproteomics has recently become an important method for throughout and in-depth study of microorganisms. However, major challenges in terms of sample processing, mass spectrometric data acquisition, and data analysis limit the development of metaproteomics owing to the complexity and high heterogeneity of microbial community samples. In metaproteomic analysis, optimizing the preprocessing method for different types of samples and adopting different microbial isolation, enrichment, extraction, and lysis schemes are often necessary. Similar to those for single-species proteomics, the mass spectrometric data acquisition modes for metaproteomics include data-dependent acquisition (DDA) and data-independent acquisition (DIA). DIA can collect comprehensive peptide information from a sample and holds great potential for future development. However, data analysis for DIA is challenged by the complexity of metaproteome samples, which hinders the deeper coverage of metaproteomes. The most important step in data analysis is the construction of a protein sequence database. The size and completeness of the database strongly influence not only the number of identifications, but also analyses at the species and functional levels. The current gold standard for metaproteome database construction is the metagenomic sequencing-based protein sequence database. A public database-filtering method based on an iterative database search has been proven to have strong practical value. The peptide-centric DIA data analysis method is a mainstream data analysis strategy. The development of deep learning and artificial intelligence will greatly promote the accuracy, coverage, and speed of metaproteomic analysis. In terms of downstream bioinformatics analysis, a series of annotation tools that can perform species annotation at the protein, peptide, and gene levels has been developed in recent years to determine the composition of microbial communities. The functional analysis of microbial communities is a unique feature of metaproteomics compared with other omics approaches. Metaproteomics has become an important component of the multi-omics analysis of microbial communities, and has great development potential in terms of depth of coverage, sensitivity of detection, and completeness of data analysis.
What problem does this paper attempt to address?