Application of Unsupervised Feature Selection in Cashmere and Wool Fiber Recognition
Yaolin Zhu, Xingze Wang, Meihua Gu, Gang Hu, Wenya Li
a School of Electronics and Information, Xi'an Polytechnic University, Xi'an, China
b School of Science, Xi'an University of Technology, Xi'an, China
c School of Textile Science and Engineering, Xi'an Polytechnic University, Xi'an, China
DOI: https://doi.org/10.1080/15440478.2024.2311306
2024-02-12
Journal of Natural Fibers
Abstract: Suitable features are the key to identifying cashmere and wool fibers, and feature selection is an important step in classification. Existing supervised feature selection methods rely on the relationship between fiber features and class labels, so they require labeled samples. To address this limitation, we propose an unsupervised feature selection method based on k-means clustering, which overcomes the difficulty that class labels for fiber features are either unavailable or costly to obtain. First, normalized subsets of fiber features are clustered with the k-means algorithm to obtain the total number of clusters, and the clustering quality is evaluated with the Davies-Bouldin (DB) index. Next, the DB value of each feature subset, the correlation among features, and the total number of clusters serve as the criteria for selecting the optimal feature subset. Finally, the optimal feature subset obtained by the unsupervised feature selection algorithm is fed into a support vector machine for automatic identification and classification of the two fibers. The experimental results show that the method achieves a recognition rate of 97.25%, which verifies that the unsupervised feature selection method based on k-means clustering is effective for recognizing cashmere and wool.
materials science, textiles
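The selection step described in the abstract (normalize, cluster each candidate feature subset with k-means, score it with the DB index, keep the best subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the exhaustive search over subsets, the fixed k = 2, min-max normalization, the helper names, and the synthetic data are all assumptions made for illustration, and the correlation and cluster-count criteria mentioned in the abstract are omitted. The final SVM classification stage is also left out.

```python
import itertools
import math
import random


def min_max_normalize(rows):
    """Scale every feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(r, lo, hi)) for r in rows]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index; lower values mean tighter, better-separated clusters."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter.append(
            sum(math.dist(p, centroids[c]) for p in members) / len(members)
            if members else 0.0)
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k


def best_subset(rows, subset_size, k=2):
    """Exhaustively score feature subsets by DB index (assumed search strategy)."""
    data = min_max_normalize(rows)
    best = None
    for subset in itertools.combinations(range(len(data[0])), subset_size):
        projected = [tuple(r[i] for i in subset) for r in data]
        labels, centroids = kmeans(projected, k)
        score = davies_bouldin(projected, labels, centroids)
        if best is None or score < best[1]:
            best = (subset, score)
    return best


# Synthetic data (hypothetical): feature 0 separates two groups, feature 1 does not.
rows = [(i * 0.05, (i * 7 % 20) / 20) for i in range(20)]
rows += [(10 + i * 0.05, (i * 7 % 20) / 20 + 0.025) for i in range(20)]
subset, score = best_subset(rows, subset_size=1)
print(subset, round(score, 3))
```

On this toy data the separable feature (index 0) receives the lower DB value and is selected. In practice the chosen subset would then be passed to a classifier such as the support vector machine named in the abstract.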