Machine learning methods for compound annotation in non‐targeted mass spectrometry—A brief overview of fingerprinting, in silico fragmentation and de novo methods

Francesco F. Russo, Yannek Nowatzky, Carsten Jaeger, Maria K. Parr, Phillipp Benner, Thilo Muth, Jan Lisec
DOI: https://doi.org/10.1002/rcm.9876
IF: 2.586
2024-08-27
Rapid Communications in Mass Spectrometry
Abstract: Non‐targeted screenings (NTS) are essential tools in fields such as forensics, health and environmental sciences. NTS often employ mass spectrometry (MS) methods because of their higher throughput and sensitivity compared with, for example, nuclear magnetic resonance–based methods. Because the identification of mass spectral signals, called annotation, is labour intensive, supporting tools based on machine learning (ML) have been developed for it. However, both the diversity of mass spectral signals and the sheer quantity of different ML tools developed for compound annotation make it difficult for researchers to maintain a comprehensive overview of the field. In this work, we illustrate which ML‐based methods are available for compound annotation in non‐targeted MS experiments and provide a nuanced comparison of the ML models used in MS data analysis, unravelling their unique features and performance metrics. Through this overview, we aim to help researchers apply these tools judiciously in their daily research. This review also explores methods and datasets in detail to show gaps in current methods and promising target areas, offering a starting point for developers intending to improve existing methodologies.
spectroscopy; chemistry, analytical; biochemical research methods