Study of the Feasibility of Distinguishing Cigarettes of Different Brands Using an Adaboost Algorithm and Near-Infrared Spectroscopy

Chao Tan, Menglong Li, Xin Qin
DOI: https://doi.org/10.1007/s00216-007-1461-2
2007-01-01
Analytical and Bioanalytical Chemistry
Abstract: The feasibility of using an Adaboost algorithm in conjunction with near-infrared (NIR) spectroscopy to automatically distinguish cigarettes of different brands was explored. Simple linear discriminant analysis (LDA) was used as the base algorithm to train all weak classifiers in Adaboost. Principal component analysis (PCA) and its kernel version (kernel principal component analysis, KPCA) were both used for feature extraction and compared with each other. The influence of the training set size on the final classification model was also investigated. A case study demonstrated that Adaboost coupled with PCA or KPCA can clearly improve the ability to discriminate between samples that cannot be separated by a single linear classifier. In terms of overall performance, however, KPCA appears preferable to PCA for feature extraction, especially when the training set is relatively small. The results also indicate that more training samples should be used, if possible, in order to fully demonstrate the superiority of Adaboost. Adaboost combined with NIR spectroscopy, with KPCA for feature extraction, therefore appears to be a promising tool for distinguishing cigarettes of different brands, especially where there is an obvious overlap between the NIR spectra of cigarettes of different brands.
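The pipeline described in the abstract (feature extraction by PCA or KPCA, then Adaboost with LDA weak classifiers) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything specific in it is an assumption: the data are synthetic stand-ins for NIR spectra, only two classes are handled, the boosting uses the weighted-resampling variant of Adaboost, the kernel is RBF, and all helper names and settings (5 components, gamma, 20 rounds) are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def rbf_kpca_scores(X, n_components, gamma):
    """Kernel PCA scores of the training samples (RBF kernel is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    n = len(X)
    J = np.eye(n) - np.full((n, n), 1.0 / n)      # centering matrix
    vals, vecs = np.linalg.eigh(J @ K @ J)        # eigenvalues in ascending order
    vals = vals[::-1][:n_components]
    vecs = vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def lda_fit(X, y):
    """Two-class Fisher LDA; labels must be -1 and +1."""
    m_neg, m_pos = X[y == -1].mean(axis=0), X[y == 1].mean(axis=0)
    Xc = np.vstack([X[y == -1] - m_neg, X[y == 1] - m_pos])
    Sw = Xc.T @ Xc / len(X) + 1e-6 * np.eye(X.shape[1])  # small ridge for stability
    w = np.linalg.solve(Sw, m_pos - m_neg)
    return w, -0.5 * w @ (m_pos + m_neg)

def lda_predict(X, model):
    w, b = model
    return np.where(X @ w + b >= 0, 1, -1)

def adaboost_fit(X, y, n_rounds=20, seed=0):
    """Adaboost by weighted resampling, with LDA as the weak learner."""
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n, p=weights)    # draw a weighted bootstrap sample
        if len(np.unique(y[idx])) < 2:
            continue                              # need both classes to fit LDA
        model = lda_fit(X[idx], y[idx])
        pred = lda_predict(X, model)
        err = weights[pred != y].sum()            # weighted error on the full set
        if err >= 0.5:
            continue                              # discard learners no better than chance
        alpha = 0.5 * np.log((1.0 - max(err, 1e-10)) / max(err, 1e-10))
        ensemble.append((alpha, model))
        weights = weights * np.exp(-alpha * y * pred)  # up-weight misclassified samples
        weights /= weights.sum()
    return ensemble

def adaboost_predict(X, ensemble):
    score = np.zeros(len(X))
    for alpha, model in ensemble:
        score += alpha * lda_predict(X, model)
    return np.where(score >= 0, 1, -1)

# Synthetic two-class "spectra": 200 samples, 50 variables, overlapping classes.
rng = np.random.default_rng(0)
y = np.repeat([-1, 1], 100)
X = rng.normal(size=(200, 50)) + 0.3 * y[:, None]

Z_pca = pca_scores(X, n_components=5)
Z_kpca = rbf_kpca_scores(X, n_components=5, gamma=1.0 / X.shape[1])

pred_pca = adaboost_predict(Z_pca, adaboost_fit(Z_pca, y))
pred_kpca = adaboost_predict(Z_kpca, adaboost_fit(Z_kpca, y))
acc_pca = np.mean(pred_pca == y)
acc_kpca = np.mean(pred_kpca == y)
print("training accuracy (PCA): ", acc_pca)
print("training accuracy (KPCA):", acc_kpca)
```

Resampling is used here because a plain LDA fit has no natural place for per-sample weights. The accuracies printed are training accuracies on toy data and say nothing about the paper's results; a comparison along the lines of the abstract would need held-out samples, varying training set sizes, and a projection step for new spectra.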