TopLib: Building and searching top-down mass spectral libraries for proteoform identification

Kun Li, Haixu Tang, Xiaowen Liu
DOI: https://doi.org/10.1101/2024.11.12.623220
2024-11-15
Abstract: Mass spectral library search is a widely used approach for spectral identification in mass spectrometry (MS)-based proteomics. While numerous methods exist for building and searching bottom-up mass spectral libraries, software tools for top-down mass spectral libraries are lacking. To fill this gap, we introduce TopLib, the first software package designed for building and searching top-down spectral libraries. TopLib uses an efficient spectral representation technique to reduce database size and improve query speed and performance. We systematically evaluated various spectral representation techniques and scoring functions for top-down spectral clustering and search. Our results demonstrated that, compared with conventional database search methods in top-down MS, TopLib is 140 times faster and achieves better reproducibility in proteoform identification.
Bioinformatics