A Semi-Supervised Self-Training Ensemble Method Based on Clustering

Yue Yin, Yan Zhang, Shanshan Xie, Danjv Lv, Jing Lu, Haiyan Cheng
DOI: https://doi.org/10.1109/iccc56324.2022.10066026
2022-01-01
Abstract: When a classifier is trained with a self-training algorithm, misclassified unlabelled samples lower the classification accuracy. To reduce this problem, this paper proposes a self-training method based on clustering. The method calculates the centroids of the labelled samples with the KNN method and selects, from the unlabelled samples, those close to the class centroids as the samples with higher confidence. In addition, a semi-supervised self-training ensemble method based on clustering is proposed, which takes advantage of the differences among classifiers and the idea of ensemble learning. Comparison experiments on public data sets and bird song data sets verify the effectiveness of the proposed method.
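The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the KNN method produces the centroids, so the sketch assumes that each class centroid is simply the mean of that class's labelled samples, and that "close" means smallest Euclidean distance. The function name `select_confident` and the parameter `per_class` are hypothetical and chosen only for illustration. The ensemble step is not covered here.

```python
import numpy as np

def select_confident(X_lab, y_lab, X_unl, per_class=1):
    """Return indices and pseudo-labels of the unlabelled samples
    that lie closest to each class centroid."""
    classes = np.unique(y_lab)
    # Assumption: a class centroid is the mean of its labelled samples
    # (the paper derives centroids with a KNN-based procedure instead).
    centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    # Euclidean distance from every unlabelled sample to every centroid.
    dists = np.linalg.norm(X_unl[:, None, :] - centroids[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)      # position of the nearest centroid
    nearest_dist = dists.min(axis=1)    # distance to that centroid
    chosen, pseudo = [], []
    for k, c in enumerate(classes):
        members = np.where(nearest == k)[0]
        # Keep the `per_class` members closest to this centroid.
        best = members[np.argsort(nearest_dist[members])[:per_class]]
        chosen.extend(best.tolist())
        pseudo.extend([c] * len(best))
    return np.array(chosen, dtype=int), np.array(pseudo)

# Toy data: two labelled classes and five unlabelled points.
X_lab = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_lab = np.array([0, 0, 1, 1])
X_unl = np.array([[0.1, 0.4], [2.0, 3.0], [5.1, 5.4], [0.0, 2.0], [4.0, 5.0]])

idx, pseudo = select_confident(X_lab, y_lab, X_unl, per_class=1)
print(idx, pseudo)
```

In a self-training loop, the selected samples and their pseudo-labels would be appended to the labelled set before the classifier is retrained; keeping `per_class` small limits how many possibly wrong labels enter the training set in each round.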