Pro-IDD: Pareto-based Ensemble for Imbalanced and Drifting Data Streams

Muhammad Usman, Huanhuan Chen
DOI: https://doi.org/10.1016/j.knosys.2023.111103
IF: 8.139
2023-01-01
Knowledge-Based Systems
Abstract: Concept drift and class imbalance are two primary challenges in supervised data stream classification, and their co-occurrence presents an even more complicated learning problem. To tackle these challenges, this paper proposes Pro-IDD, a Pareto-based ensemble for imbalanced and drifting data streams. As part of Pro-IDD, the Min++ module addresses class imbalance by improving minority class visibility, so that class overlaps are reduced and small disjuncts in the minority space are enlarged. Additionally, the ProEns module constructs the ensemble pool while taking both concept drift and class imbalance into account. ProEns prunes the ensemble pool using Pareto-based multi-objective learning over two measures: a time-decayed recall-based weight and ensemble diversity. Experiments are conducted on 20 data streams with concept drift and class imbalance, and comparisons are reported against 10 state-of-the-art methods. Results show that improving minority class visibility, and selecting ensemble members by time-decayed recall-based weight and diversity through Pareto-based multi-objective learning, could improve the classification performance of data stream learners in the presence of concept drift and class imbalance.
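The Pareto-based pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the exponential decay in `time_decayed_recall`, the `decay` parameter, the function names, and the example scores are all assumptions made for illustration. The sketch only shows the general technique of keeping ensemble members that are non-dominated on two objectives to be maximized, here a recall-based weight and a diversity score.

```python
from typing import List, Tuple


def time_decayed_recall(hits: List[int], decay: float = 0.9) -> float:
    """Exponentially decayed recall (assumed form, not taken from the paper).

    `hits` holds 1 for a correctly classified example of the class of
    interest and 0 for a miss, ordered oldest first, so recent outcomes
    carry more weight than older ones.
    """
    num = 0.0
    den = 0.0
    for h in hits:
        num = decay * num + h
        den = decay * den + 1.0
    return num / den if den > 0 else 0.0


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated (weight, diversity) pairs.

    Both objectives are maximized. Member j dominates member i if j is
    no worse on both objectives and strictly better on at least one.
    """
    front = []
    for i, (w_i, d_i) in enumerate(scores):
        dominated = False
        for j, (w_j, d_j) in enumerate(scores):
            if j == i:
                continue
            if w_j >= w_i and d_j >= d_i and (w_j > w_i or d_j > d_i):
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical (weight, diversity) scores for five ensemble members.
example_scores = [(0.9, 0.1), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9), (0.9, 0.05)]
kept = pareto_front(example_scores)
```

With these made-up scores, members 2 and 4 are dominated (by members 1 and 0 respectively) and would be pruned, while members 0, 1, and 3 remain because none is beaten on both objectives at once. Keeping the whole front, rather than collapsing the two measures into one weighted sum, avoids having to fix a trade-off between accuracy on the minority class and diversity in advance.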