SpecieScan: Semi-Automated taxonomic identification of bone collagen peptides from MALDI-ToF MS spectra

Emese I Végh, Katerina Douka
DOI: https://doi.org/10.1093/bioinformatics/btae054
IF: 5.8
2024-02-09
Bioinformatics
Abstract:
Motivation: Zooarchaeology by Mass Spectrometry (ZooMS) is a palaeoproteomics method for the taxonomic determination of collagen, which traditionally involves challenging manual spectra analysis with limitations in quantitative results. As the ZooMS reference database expands, a faster and reproducible identification tool is necessary. This paper presents SpecieScan, an open-access algorithm for automating taxa identification from raw MALDI-ToF Mass Spectrometry (MS) data.
Results: SpecieScan was developed using R (pre-processing) and Python (automation). The algorithm's output includes identified peptide markers, closest matching taxonomic group (taxon, family, order), correlation scores with the reference databases, and contaminant peaks present in the spectra. Testing on original MS data from bones discovered at several archaeological sites, including Denisova Cave, as well as publicly available externally produced data, achieved >90% accuracy at the genus-level and ∼92% accuracy at the family-level for data of mammalian bone collagen previously analysed using manual methods.
Supplementary information: The SpecieScan algorithm, original raw data, results, reference databases, and contamination lists are freely available on the web and can be found on GitHub (https://github.com/mesve/SpecieScan).
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology