Identification of geographical origins of Gastrodia elata Blume based on multisource data fusion

Hong Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang
DOI: https://doi.org/10.1002/pca.3413
2024-06-27
Abstract: Introduction: Identifying the geographical origin of Gastrodia elata Blume (G. elata Bl.) contributes to the scientific and rational utilization of medicinal materials. In this study, infrared spectroscopy was combined with machine learning algorithms to distinguish the origin of G. elata Bl. Objective: To achieve rapid and accurate identification of the origin of G. elata Bl. Materials and methods: Attenuated total reflection Fourier transform infrared (ATR-FTIR) spectra and Fourier transform near-infrared (FT-NIR) spectra were collected for 306 G. elata Bl. samples. First, support vector machine (SVM) models were established based on the single-spectrum data and the full-spectrum fusion data. To investigate whether a feature-level fusion strategy can enhance model performance, a sequential and orthogonalized partial least squares discriminant analysis (SO-PLS-DA) model was established to extract and combine the two types of spectral features. Next, six algorithms were employed to extract feature variables, and an SVM model was established based on the feature-level fusion data. To avoid complicated preprocessing and feature extraction, a residual convolutional neural network (ResNet) model was established after converting the raw spectral data into spectral images. Results: The feature-level fusion model was more accurate than the single-spectrum models and the full-spectrum fusion model, and SO-PLS-DA was simpler than feature-level fusion based on the SVM model. The ResNet model performed well in classification but requires more data to improve its generalization capability and training effectiveness. Conclusion: Sequential and orthogonalized data fusion approaches and ResNet models are powerful solutions for identifying the geographical origin of G. elata Bl.
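The full-spectrum fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that fusion means joining the ATR-FTIR and FT-NIR variables of each sample into one row after column-wise autoscaling; the abstract does not state the preprocessing actually used, so the scaling, the function names (autoscale, full_spectrum_fusion), and the toy values below are hypothetical. The fused matrix would then be passed to a classifier such as an SVM, which is not shown here.

```python
from statistics import mean, pstdev


def autoscale(block):
    """Scale each column of one spectral block to zero mean and unit variance."""
    columns = list(zip(*block))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid an error.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in block]


def full_spectrum_fusion(block_a, block_b):
    """Concatenate two autoscaled blocks sample by sample (assumed fusion scheme)."""
    if len(block_a) != len(block_b):
        raise ValueError("both blocks must describe the same samples, in the same order")
    scaled_a, scaled_b = autoscale(block_a), autoscale(block_b)
    return [row_a + row_b for row_a, row_b in zip(scaled_a, scaled_b)]


# Hypothetical toy data: 4 samples, 3 ATR-FTIR variables and 2 FT-NIR variables.
atr_ftir = [[0.10, 0.20, 0.30], [0.20, 0.10, 0.40], [0.30, 0.40, 0.10], [0.40, 0.30, 0.20]]
ft_nir = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]

fused = full_spectrum_fusion(atr_ftir, ft_nir)
print(len(fused), len(fused[0]))  # number of samples, number of fused variables
```

Scaling each block before concatenation is one common way to keep a block with larger raw values from dominating the fused matrix; feature-level fusion would differ in that selected or extracted features, not the full spectra, are concatenated.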