OEC: an Online Ensemble Classifier for Mining Data Streams with Noisy Labels

Ling Jian,Kai Shao,Ying Liu,Jundong Li,Xijun Liang
DOI: https://doi.org/10.1007/s10618-023-00990-0
IF: 5.406
2024-01-01
Data Mining and Knowledge Discovery
Abstract:Distilling actionable patterns from large-scale streaming data in the presence of concept drift is a challenging problem, especially when data is polluted with noisy labels. To date, various data stream mining algorithms have been proposed and extensively used in many real-world applications. Considering the functional complementation of classical online learning algorithms and with the goal of combining their advantages, we propose an Online Ensemble Classification (OEC) algorithm to integrate the predictions obtained by different base online classification algorithms. The proposed OEC method works by learning weights of different base classifiers dynamically through the classical Normalized Exponentiated Gradient (NEG) algorithm framework. As a result, the proposed OEC inherits the adaptability and flexibility of concept drift-tracking online classifiers, while maintaining the robustness of noise-resistant online classifiers. Theoretically, we show OEC algorithm is a low regret algorithm which makes it a good candidate to learn from noisy streaming data. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed OEC method.
What problem does this paper attempt to address?