Non-Destructive Identification of Wool and Cashmere Fibers Based on Cascade Optimizations of Interval-Wavelength Selection Using NIR Spectroscopy
Xin Chen, Qingle Lan, Yaolin Zhu, Jinni Chen
a School of Electronics and Information, Xi'an Polytechnic University, Xi'an, China
b School of Automation, Northwestern Polytechnical University, Xi'an, China
DOI: https://doi.org/10.1080/15440478.2024.2409877
2024-10-02
Journal of Natural Fibers
Abstract: Near-infrared (NIR) spectroscopy is an effective method for identifying wool and cashmere fibers, and its high-dimensional spectral data carry a wealth of information. However, redundant and interfering wavelengths can reduce the accuracy and robustness of the models built on those spectra. To address this, a novel interval-wavelength cascaded optimization method is proposed. First, the collected spectral data are preprocessed with the standard normal variate (SNV) transformation to eliminate scattering effects. Next, the backward interval partial least squares (BiPLS) algorithm makes a preliminary selection of spectral intervals. Three variable selection algorithms, competitive adaptive reweighted sampling (CARS), the successive projection algorithm (SPA), and the whale optimization algorithm (WOA), are then each applied separately for secondary wavelength optimization. Finally, support vector machine (SVM) and random forest (RF) discriminant models are built on the extracted subset of wavelengths to classify the fibers. In the experiments, the cascaded BiPLS-WOA method selects 36 wavelengths; with the SVM model, the validation-set accuracy reaches 96.9% and the area under the ROC curve (AUC) reaches 99.3%. The results demonstrate that the proposed method eliminates redundant and collinear variables and is effective for distinguishing wool from cashmere fibers.
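As an illustration of the SNV preprocessing step named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the usual row-wise definition of SNV (each spectrum is centered on its own mean and divided by its own standard deviation); the function name `snv`, the use of the sample standard deviation, and the toy absorbance values are all assumptions made for this example.

```python
from statistics import mean, stdev


def snv(spectra):
    """Apply standard normal variate (SNV) scaling to each spectrum (row)."""
    corrected = []
    for spectrum in spectra:
        mu = mean(spectrum)
        # Sample standard deviation (n - 1 denominator); this choice is an assumption.
        sigma = stdev(spectrum)
        if sigma == 0:
            raise ValueError("flat spectrum: SNV is undefined")
        corrected.append([(x - mu) / sigma for x in spectrum])
    return corrected


# Toy absorbance values, invented for illustration only.
# The second row is the first row scaled by 2 and shifted by 0.9,
# mimicking a multiplicative and additive scatter effect.
toy = [[0.10, 0.20, 0.30, 0.40], [1.10, 1.30, 1.50, 1.70]]
scaled = snv(toy)
```

Because SNV removes a per-spectrum offset and scale, the two toy rows coincide after the transformation (up to floating-point error), which is the sense in which it suppresses scattering differences before interval and wavelength selection.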
materials science, textiles