Reliable cross-ion mode chemical similarity prediction between MS spectra

Niek de Jonge, David Joas, Lem-Joe Truong, Justin J.J. van der Hooft, Florian Huber
DOI: https://doi.org/10.1101/2024.03.25.586580
2024-04-02
Abstract: Mass spectrometry is commonly used to characterize metabolites in untargeted metabolomics. This can be done in positive and negative ionization mode, a choice typically guided by the fraction of metabolites a researcher is interested in. During analysis, mass spectral comparisons are widely used to enable annotation through reference libraries and to facilitate data organization through networking. However, until now, such comparisons between mass spectra were restricted to mass spectra of the same ionization mode, as the two modes generally result in very distinct fragmentation spectra. To overcome this barrier, here, we have implemented a machine learning model that can predict chemical similarity between spectra of different ionization modes. Hence, our new MS2DeepScore 2.0 model facilitates the seamless integration of positive and negative ionization mode mass spectra into one analysis pipeline. This creates entirely new options for data exploration, such as mass spectral library searching of negative ion mode spectra in positive ion mode libraries or cross-ionization mode molecular networking. Furthermore, to improve the reliability of predictions and better cope with unseen data, we have implemented a method to estimate the quality of prediction. This will help to avoid false predictions on spectra with low information content or spectra that substantially differ from the training data. We anticipate that the MS2DeepScore 2.0 model will extend our current capabilities in organizing and annotating untargeted metabolomics profiles.
Bioinformatics