EMRIL: Ensemble Method Based on Reinforcement Learning for Binary Classification in Imbalanced Drifting Data Streams

Muhammad Usman, Huanhuan Chen
DOI: https://doi.org/10.1016/j.neucom.2024.128259
IF: 6
2024-01-01
Neurocomputing
Abstract: The co-occurrence of evolving concepts and imbalanced data deteriorates the learning performance of classifiers in data streams. Recent studies do not account for the data difficulty factors associated with class imbalance, i.e., imbalance complexity, which complicates imbalance learning in a drifting data environment. This paper proposes EMRIL, a novel batch-based ensemble method, to deal with this challenge. As part of EMRIL, the Imbalance Complexity Redressing Component (EMRIL_ICRC), a data-level balancing module, resolves the imbalance complexity to increase minority class visibility for the base classifiers of the ensemble. Additionally, a novel ensemble pool management technique (EMRIL_EPM) is designed using Reinforcement Learning (RL). EMRIL_EPM regularly updates the ensemble pool and constructs an optimal base classifier subset for predictions through effective training and evaluation policies. Handling imbalance complexity and managing the ensemble pool with RL together help EMRIL perform binary classification effectively in imbalanced and evolving data streams. A comprehensive experimental evaluation is conducted on 104 data streams that contain a variety of concept drifts and imbalance ratios, categorized by various data difficulty factors. The results are compared with those of 15 state-of-the-art methods and show the superiority of the proposed method.