Highlights:
• Pseudo-targeted metabolomics and deep neural network modeling were integrated.
• This integrated strategy efficiently recorded MS information of the ginseng metabolome.
• The DNN model gave perfect classification performance in differentiating PJ and PJvm.
• A good example was set for discriminating easily confused plants such as ginseng.

Abstract: Metabolomics covers a wide range of applications in life sciences, biomedicine, and phytology. Data acquisition (to achieve high coverage and efficiency) and analysis (to pursue good classification) are two key segments of metabolomics workflows. Various chemometric approaches utilizing either pattern recognition or machine learning have been employed to separate different groups. However, insufficient feature extraction, inappropriate feature selection, overfitting, or underfitting can leave a model unable to discriminate plants that are easily confused. Using two ginseng varieties that contain similar ginsenosides, namely Panax japonicus (PJ) and P. japonicus var. major (PJvm), we integrated pseudo-targeted metabolomics and deep neural network (DNN) modeling to achieve accurate species differentiation. The pseudo-targeted metabolomics approach was optimized with respect to data acquisition mode, ion pair generation, the choice between multiple reaction monitoring (MRM) and scheduled MRM, and the chromatographic elution gradient. In total, 1980 ion pairs were monitored within 23 min, allowing for the most comprehensive ginseng metabolome analysis. The established DNN model demonstrated excellent classification performance (in terms of accuracy, precision, recall, F1 score, area under the curve, and receiver operating characteristic) using both the entire metabolome data and the feature-selection dataset, outperforming random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). Moreover, DNNs were advantageous for automated feature learning, nonlinear modeling, adaptability, and generalization. This study confirmed the practicality of the established strategy for efficient metabolomics data analysis and reliable classification performance even when using small-volume samples. This established approach holds promise for plant metabolomics and is not limited to ginseng.
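As an illustration of the classification metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1 score could be computed for a two-class PJ versus PJvm problem. The function name `binary_metrics`, the label strings, and the example predictions are hypothetical and are not taken from the study; the paper's actual evaluation code and data are not shown here.

```python
def binary_metrics(y_true, y_pred, positive="PJ"):
    """Compute accuracy, precision, recall, and F1 for a two-class problem.

    `positive` is the label treated as the positive class (an assumption
    made for this illustration only).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for six samples (not data from the paper).
y_true = ["PJ", "PJ", "PJ", "PJvm", "PJvm", "PJvm"]
y_pred = ["PJ", "PJ", "PJvm", "PJvm", "PJvm", "PJ"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A classifier that labels every sample correctly would score 1.0 on all four metrics, which is what "perfect classification performance" means in these terms.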

HerbMet: Enhancing metabolomics data analysis for accurate identification of Chinese herbal medicines using deep learning

An Optimized Multi-Classifiers Ensemble Learning for Identification of Ginsengs Based on Electronic Nose

Feature Engineering in Discrimination of Herbal Medicines from Different Geographical Origins with Electronic Nose

Integration of deep neural network modeling and LC-MS-based pseudo-targeted metabolomics to discriminate easily confused ginseng species

Application of Ultra-Performance LC-TOF MS Metabolite Profiling Techniques to the Analysis of Medicinal Panax Herbs

Screening Specific Biomarkers of Herbs Using a Metabolomics Approach: A Case Study of Panax ginseng

Application of Metabolomics in the Identification of Chinese Herbal Medicine

A Metabolomics Strategy for Authentication of Plant Medicines with Multiple Botanical Origins, A Case Study of Uncariae Rammulus Cum Uncis

Metabolite profiling and characterization for medicinal herbal remedies.

Image recognition of traditional Chinese medicine based on deep learning

Identification of Panax notoginseng origin using terahertz precision spectroscopy and neural network algorithm

Identify production area, growth mode, species, and grade of Astragali Radix using metabolomics “big data” and machine learning

An Identification Method of Herbal Medicines Superior to Traditional Spectroscopy: Two-dimensional Correlation Spectral Images Combined With Deep Learning

A MS-feature-based medicinal plant database-driven strategy for ingredient identification of Chinese medicine prescriptions

Prediction Methods of Herbal Compounds in Chinese Medicinal Herbs

Digital identification of Aucklandiae radix, Vladimiriae radix, and Inulae radix based on multivariate algorithms and UHPLC-QTOF-MS analysis

Classifying herbal medicine origins by temporal and spectral data mining of electronic nose

Nontargeted metabolomic analysis and "commercial-homophyletic" comparison-induced biomarkers verification for the systematic chemical differentiation of five different parts of Panax ginseng.

A method for accurate identification of Uyghur medicinal components based on Raman spectroscopy and multi-label deep learning

Authentication of Herbal Medicines from Multiple Botanical Origins with Cross-Validation Metabolomics, Absolute Quantification and Support Vector Machine Model, a Case Study of Rhizoma Alismatis

Application of Molecular Methods in the Identification of Ingredients in Chinese Herbal Medicines