A Robust Semi-Supervised Broad Learning System Guided by Ensemble-Based Self-Training
Jifeng Guo, C. L. Philip Chen
DOI: https://doi.org/10.1109/TCYB.2024.3393020
Abstract: Combining the broad learning system (BLS) with semi-supervised learning reduces its dependence on labels and broadens its range of applications. Despite some progress, semi-supervised BLS still needs improvement, especially for self-training-based methods in imbalanced-data or concept-drift scenarios. To this end, this article proposes a robust semi-supervised BLS guided by ensemble-based self-training (ESTSS-BLS). Unlike conventional self-training, which assigns a pseudo-label using a single classifier and its confidence, the proposed ensemble-based self-training determines the pseudo-label from the votes of multiple BLSs. In addition, label purity, a comprehensive evaluation of the voting, is proposed to ensure the correctness and credibility of the auxiliary training data. During iterative learning, a small portion of labeled data first trains multiple BLSs in parallel. Then, the system recursively updates its data, structure, and meta-parameters using label purity and a data-driven dynamic nodes mechanism, which guides the network's structural adjustments to address the concept drift caused by a large amount of auxiliary training data. Experimental results show that ESTSS-BLS outperforms existing methods, with the lowest time consumption and the highest accuracy, precision, recall, F1 score, and AUC. Notably, it achieves an accuracy of 87.84% with only 0.1% labeled data on MNIST, and with just 2% labeled data it matches the performance of supervised learning using all training data on NORB. ESTSS-BLS also performs stably on medical and biological data, verifying its high adaptability.
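The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for label purity, so the code below assumes a simple stand-in: the fraction of classifiers that agree with the majority vote. The function name `vote_pseudo_labels`, the `purity_threshold` parameter, and the hard-coded vote matrix are all hypothetical and are not taken from the paper; in ESTSS-BLS the votes would come from multiple BLSs trained in parallel.

```python
import numpy as np

def vote_pseudo_labels(votes, n_classes, purity_threshold=0.8):
    """Assign pseudo-labels by majority vote across an ensemble.

    votes: integer array of shape (n_models, n_samples), where
           votes[i, j] is the class predicted by model i for sample j.
    Returns (labels, purity, accepted).
    NOTE: "purity" here is an assumed proxy (majority share of the
    votes), not the definition used in the paper.
    """
    votes = np.asarray(votes)
    n_models = votes.shape[0]
    # counts[c, j] = number of models that voted class c for sample j
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    labels = counts.argmax(axis=0)          # majority class per sample
    purity = counts.max(axis=0) / n_models  # share of votes for that class
    accepted = purity >= purity_threshold   # keep only high-agreement samples
    return labels, purity, accepted

# Five hypothetical classifiers voting on four unlabeled samples.
votes = np.array([
    [0, 1, 2, 1],
    [0, 1, 0, 1],
    [0, 2, 1, 1],
    [0, 1, 2, 1],
    [0, 1, 2, 0],
])
labels, purity, accepted = vote_pseudo_labels(votes, n_classes=3)
print(labels, purity, accepted)
```

In a self-training loop, only the samples flagged in `accepted` would be added to the auxiliary training data before the ensemble is retrained; the threshold trades the amount of pseudo-labeled data against its reliability.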