To Quickly Detect the Geographical Origin of Baimudan Tea by Multi-AdaBoost Model Combined with Raman Spectroscopy

Wei Ping, Wenjing Liu, Yuwu Chi
DOI: https://doi.org/10.21203/rs.3.rs-3199350/v1
2023-01-01
Abstract: The Multi-AdaBoost model has great potential in the field of spectral analysis. Baimudan tea is a high-quality white tea. To date, identification of the geographical origin of Baimudan tea by Raman spectroscopy combined with a Multi-AdaBoost model has not been reported. In this paper, Raman spectroscopy combined with a Multi-AdaBoost model was used to achieve rapid, nondestructive and precise identification of the origin of Baimudan tea. First, Raman spectra of Baimudan tea from four origins in Fujian, China, namely Fuan (FA), Fuding (FD), Zhenghe (ZH) and Songxi (SX), were collected. Then, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Decision Tree (DT) classifier models were constructed from the effective features extracted by Principal Component Analysis (PCA). Finally, the classifier models were further optimized with the Multi-AdaBoost model. Among the four classifiers, the SVM model performed best, with an average recognition rate of 92.71%. To further improve the identification performance and generalization ability of the models, the DT and SVM models were used as base classifiers to construct the Multi-AdaBoost-DT and Multi-AdaBoost-SVM models. Compared with the DT model, the recognition rates of the Multi-AdaBoost-DT model for the FA, FD, ZH and SX origins all increased significantly, and the average identification rate increased from 86.46% to 91.67%. Compared with the SVM model, the recognition rates of the Multi-AdaBoost-SVM model for the FA and SX origins remained unchanged because the constructed model had reached a local optimum. However, the recognition rates for the FD and ZH origins increased from 91.67% to 95.83% and from 83.33% to 87.50%, respectively, and the average identification rate increased from 92.71% to 94.79%.
The above results show that the Multi-AdaBoost-DT and Multi-AdaBoost-SVM models, built through repeated training rounds that reweight the incorrectly discriminated samples, are strong classifiers that significantly improve classification accuracy and show good prospects for application in Raman spectral analysis. The constructed Multi-AdaBoost-SVM classifier model can effectively identify the geographical origin of Baimudan tea.
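
The sample-reweighting step behind boosting can be illustrated with a minimal sketch. It assumes the standard SAMME formulation of multi-class AdaBoost, which may differ in detail from the authors' Multi-AdaBoost model: in each round, misclassified samples gain relative weight, so the next base classifier concentrates on them, and the final label is chosen by a vote weighted by each classifier's alpha. The function names `samme_update` and `weighted_vote`, the eight toy labels and the hypothetical predictions are illustrative assumptions, not data or code from the paper.

```python
import math

def samme_update(weights, y_true, y_pred, n_classes):
    """One SAMME (multi-class AdaBoost) round: return (alpha, new_weights)."""
    miss = [t != p for t, p in zip(y_true, y_pred)]
    # Weighted error of the current base classifier.
    err = sum(w for w, m in zip(weights, miss) if m) / sum(weights)
    if err >= 1.0 - 1.0 / n_classes:
        raise ValueError("base classifier is no better than random guessing")
    err = max(err, 1e-10)  # avoid division by zero for a perfect classifier
    alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
    # Misclassified samples are scaled up by exp(alpha); the rest are unchanged.
    raw = [w * math.exp(alpha) if m else w for w, m in zip(weights, miss)]
    total = sum(raw)
    return alpha, [w / total for w in raw]  # renormalise so weights sum to 1

def weighted_vote(alphas, predictions, classes):
    """Combine one prediction per base classifier into a final label."""
    votes = {c: 0.0 for c in classes}
    for alpha, label in zip(alphas, predictions):
        votes[label] += alpha
    return max(classes, key=lambda c: votes[c])

# Toy data (assumption): eight samples, two per origin, uniform start weights.
classes = ["FA", "FD", "ZH", "SX"]
y_true = ["FA", "FA", "FD", "FD", "ZH", "ZH", "SX", "SX"]
# Hypothetical base-classifier output with two mistakes (indices 3 and 5).
y_pred = ["FA", "FA", "FD", "ZH", "ZH", "FD", "SX", "SX"]
start = [1.0 / len(y_true)] * len(y_true)

alpha, new_weights = samme_update(start, y_true, y_pred, len(classes))
final_label = weighted_vote([1.0, 0.4, 0.5], ["FD", "ZH", "ZH"], classes)
```

In this toy round the weighted error is 0.25, so the two misclassified samples end up carrying more weight than the six correct ones. In a full pipeline the weak predictions would come from a DT or SVM trained on PCA scores of the Raman spectra with the current sample weights.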