Refining EI-MS library search results through atomic-level insights

Umit Volkan Ucak, Islambek Ashyrmamatov, Juyong Lee
DOI: https://doi.org/10.26434/chemrxiv-2024-vrqzf-v2
2024-05-14
Abstract: The inherent complexity of mass spectra and the lack of direct correlation between spectral and structural similarities hinder structure elucidation and accurate peak annotation. For more accurate mass-to-fragment mapping, we propose a new approach combining CFM-EI, a fragmentation likelihood modeling tool for electron-ionization mass spectrometry (EI-MS) data, with an artificial-intelligence (AI) assisted multi-step complexity reduction strategy. Our methodology used modified atomic environments to represent fragments of very small organic molecules, and we trained a Transformer model to predict the structural content of compounds from their mass and intensity data, with a peak precision of 86.1% and a recall rate of 78.4%. This holistic solution not only aids in interpreting EI-MS data by providing insights into atom types but also refines cosine similarity rankings by suggesting the inclusion or exclusion of certain atom types. Benchmark results using the NIST database demonstrated that our approach complements conventional methods by improving spectra matching through an in-depth atomic-level analysis.
Chemistry