Integration of MALDI-TOF MS and machine learning to classify enterococci: A comparative analysis of supervised learning algorithms for species prediction

Eiseul Kim, Seung-Min Yang, Jun-Hyeok Ham, Woojung Lee, Dae-Hyun Jung, Hae-Yeong Kim
DOI: https://doi.org/10.1016/j.foodchem.2024.140931
2024-08-20
Abstract: This research focused on distinguishing the matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) spectral signatures of three Enterococcus species. We evaluated and compared the predictive performance of four supervised machine learning algorithms, including K-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF), for classifying Enterococcus species. The study used a dataset of 410 strains, from which 1640 individual spectra were generated through on-plate and off-plate protein extraction methods. Although the commercial database correctly identified 76.9% of the strains, the machine learning classifiers performed better (accuracy 0.991). In the RF model, the top informative peaks played a significant role in the classification. Whole-genome sequencing showed that the most informative peaks are biomarkers connected to proteins, which are essential for understanding bacterial classification and evolution. The integration of MALDI-TOF MS and machine learning provides a rapid and accurate method for identifying Enterococcus species, with benefits for healthcare and food safety.
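The classification setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it implements only a plain K-nearest neighbor classifier (one of the algorithms named above) in standard-library Python, and it runs on synthetic peak-intensity vectors. The species labels, the profile values, the noise level, and the helper names (`make_spectrum`, `make_dataset`, `knn_predict`, `accuracy`) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

random.seed(0)  # fixed seed so the synthetic data is reproducible

# Hypothetical species labels and invented peak-intensity profiles.
# These values are NOT taken from the paper; they only stand in for
# binned MALDI-TOF MS spectra.
SPECIES_PROFILES = {
    "species_A": [0.9, 0.1, 0.5, 0.2],
    "species_B": [0.2, 0.8, 0.4, 0.1],
    "species_C": [0.3, 0.2, 0.9, 0.7],
}


def make_spectrum(profile, noise=0.05):
    """Return a noisy copy of a profile, clipped at zero intensity."""
    return [max(0.0, v + random.gauss(0, noise)) for v in profile]


def make_dataset(n_per_species):
    """Build a list of (spectrum, label) pairs for every species."""
    return [
        (make_spectrum(profile), label)
        for label, profile in SPECIES_PROFILES.items()
        for _ in range(n_per_species)
    ]


def knn_predict(train, x, k=3):
    """Predict the label of x by majority vote among its k nearest spectra."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test spectra whose predicted label matches the true one."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


train = make_dataset(20)  # synthetic training spectra
test = make_dataset(10)   # synthetic held-out spectra
acc = accuracy(train, test)
print(f"KNN accuracy on synthetic spectra: {acc:.3f}")
```

On real spectra, the feature vectors would come from preprocessed and binned peak lists rather than hand-written profiles, and the same train/test comparison would be repeated for each algorithm under evaluation.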