Handling Class Imbalance and Overlap with a Hesitation-based Instance Selection Method

Mona Moradi, Javad Hamidzadeh
DOI: https://doi.org/10.1016/j.knosys.2024.111745
IF: 8.139
2024-04-03
Knowledge-Based Systems
Abstract: Class imbalance is a common problem in machine learning, particularly in classification tasks. When the distribution of instances across the known classes is skewed, predictive performance suffers, especially for the minority class, which is often the class of greater interest. However, class imbalance is not the only factor that degrades performance: class overlap and borderline instances can also degrade classification performance. Conventional imbalanced-learning methods typically balance the class distribution, for example by oversampling the minority class or undersampling the majority class, but they may not adequately address the difficulties caused by overlapping and borderline instances. This paper extends work on classifying imbalanced datasets by addressing how boundary instances should be sampled to handle class imbalance and control class overlap. The proposed hesitation degree-based instance weighting method identifies the impact of each instance on classifier performance while reducing the skewness of the dataset and alleviating class overlap. In addition, integrating a chaotic evolutionary algorithm with the designed classifier allows the most important instances to be selected. Statistical results show that the proposed method outperforms state-of-the-art methods in terms of reduction rate, error rate, and G-mean.
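The three evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used binary-classification definitions (G-mean as the geometric mean of sensitivity and specificity, reduction rate as the fraction of training instances discarded by instance selection); the abstract does not state the paper's exact formulas, and the function names and toy labels below are hypothetical.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed binary definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def error_rate(y_true, y_pred):
    """Fraction of misclassified instances."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def reduction_rate(n_original, n_selected):
    """Fraction of the original training set removed by instance selection."""
    return 1.0 - n_selected / n_original


# Hypothetical toy data: 2 minority (label 1) and 8 majority (label 0) instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(round(g_mean(y_true, y_pred), 4))
print(error_rate(y_true, y_pred))
print(reduction_rate(1000, 250))
```

On this toy data the error rate is only 0.2 even though half of the minority class is misclassified, whereas the G-mean penalizes the poor minority recall. This is why G-mean is often preferred over plain accuracy or error rate on imbalanced data.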
computer science, artificial intelligence
What problem does this paper attempt to address?