Hybrid Firefly Optimised Ensemble Classification for Drifting Data Streams with Imbalance

Blessa Binolin Pepsi M, Senthil Kumar N
DOI: https://doi.org/10.1016/j.knosys.2024.111500
IF: 8.139
2024-02-09
Knowledge-Based Systems
Abstract: Classification learning on non-stationary data streams may face dynamic changes over time. The major problem lies in addressing the imbalance among classes and the substantial cost of labeling instances, especially in the presence of drifts. Imbalance arises when the minority class has far fewer samples than the majority class, and it leads to misclassification of data points. This paper proposes a technique that rebalances data with an oversampling approach based on imputation methods and uses a Hybrid Firefly Optimisation algorithm as a novel classifier. The imputation methods increase the number of minority samples in a data chunk. The Firefly algorithm is optimised as a classification technique with weights tuned using boosting ensemble classifiers. The proposed system is tested on seven synthetic datasets and five data stream generators. Evaluation metrics such as F-measure, AUC, and G-mean are analyzed to investigate the performance. For weather data with an imbalance ratio of 5%, the G-mean value increases by an average of 0.24% compared with existing methods. The statistical Friedman-Nemenyi test proves the stability of the proposed algorithm.
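To illustrate why the abstract reports F-measure and G-mean rather than plain accuracy, the following is a minimal sketch of the standard binary definitions of these metrics. It is not the authors' code; the function name `imbalance_metrics` and the toy labels are assumptions made for illustration only.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: standard binary metrics from label lists.

    The minority class is treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority-class recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority-class recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # G-mean is the geometric mean of the two per-class recalls.
    g_mean = math.sqrt(recall * specificity)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}

# Toy chunk with a 5% minority class and a classifier that always
# predicts the majority class (assumed data, for illustration only).
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
trivial = imbalance_metrics(y_true, y_pred)
```

In this toy case the trivial classifier reaches high accuracy while its G-mean and F-measure are both zero, which is why these metrics are the usual choice for imbalanced streams.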
computer science, artificial intelligence