The Strategies and Challenges in Metaproteomics Bioinformatics

Xu Hong-Kai, Yan Ke-Qiang, He Yan-Bin, Wen Bo, Yang Huan-Ming, Liu Si-Qi
DOI: https://doi.org/10.16476/j.pibb.2017.0187
2018-01-01
PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS
Abstract: Metaproteomics is a new frontier of microbiological science that uses mass spectrometry to collect proteomic data from microbes in their natural environments and applies systematic bioinformatics to explore the underlying genetic and biochemical mechanisms. In contrast to traditional proteomics, metaproteomic informatics adopts new strategies in its algorithms, databases and searches. Because metaproteomic samples generally contain very complex mixtures of proteins, peptide identification from mass spectrometry signals requires a large database covering all potential microbial genomes, and searching such a database is very time-consuming. Several factors, such as database size, search strategy and false-positive control, therefore have to be carefully evaluated to achieve protein identification with acceptable accuracy and efficiency. Moreover, beyond the sequence merging that is common in proteomic informatics, metaproteomics has to deal with extensive sequence homology and with the grouping of proteins by species. Solving these problems relies on effective use of the public species classification information available from NCBI, and on filtering from sequence to species with the lowest common ancestor (LCA) algorithm. Herein, we briefly introduce this field, covering the basic informatics strategy of metaproteomics, the main challenges in metaproteomic informatics, and how these technical difficulties may be solved in the future.
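As an illustration of the LCA step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a peptide has matched proteins from several taxa and assigns the peptide to the deepest taxon shared by all of them. The toy taxonomy `PARENT` and the function names `lineage` and `lowest_common_ancestor` are hypothetical; a real pipeline would load the parent relationships from the NCBI taxonomy instead.

```python
from typing import Dict, Iterable, List

# Hypothetical toy taxonomy: each taxon maps to its parent (the root maps to itself).
PARENT: Dict[str, str] = {
    "root": "root",
    "Bacteria": "root",
    "Firmicutes": "Bacteria",
    "Proteobacteria": "Bacteria",
    "Lactobacillus": "Firmicutes",
    "Bacillus": "Firmicutes",
    "Escherichia": "Proteobacteria",
}


def lineage(taxon: str, parent: Dict[str, str]) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa: Iterable[str], parent: Dict[str, str]) -> str:
    """Return the deepest taxon shared by the lineages of all given taxa."""
    lineages = [lineage(t, parent) for t in taxa]
    if not lineages:
        raise ValueError("at least one taxon is required")
    lca = lineages[0][0]  # every lineage starts at the root
    # Walk the lineages level by level until they diverge.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca
```

For example, a peptide shared by proteins from `Lactobacillus` and `Bacillus` would be assigned to `Firmicutes` in this toy tree, while one shared by `Lactobacillus` and `Escherichia` could only be assigned at the level of `Bacteria`.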