Anomaly detection-based undersampling for imbalanced classification problems

You-Jin Park, Paula Brito, Yun-Chen Ma
DOI: https://doi.org/10.1080/0305215x.2024.2315501
IF: 2.5
2024-02-22
Engineering Optimization
Abstract: In various machine learning applications, classification plays an important role in categorizing and predicting data. To improve the classification performance, it is crucial to identify and remove the anomalies. Also, class imbalance in many machine learning applications is a very common problem since most classifiers tend to be biased toward the majority class by ignoring the minority class instances. Thus, in this research, we propose a new under-sampling technique based on anomaly detection and removal to enhance the performance of imbalanced classification problems. To demonstrate the effectiveness of the proposed method, comprehensive experiments are conducted on forty imbalanced data sets and two non-parametric hypothesis tests are employed to show the statistical difference in classification performances between the proposed method and other traditional resampling methods. From the experiment, it is shown that the proposed method improves the classification performance by effectively detecting and eliminating the anomalies among true-majority or pseudo-majority class instances.
engineering, multidisciplinary; operations research & management science
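
The general idea in the abstract, scoring majority-class instances for anomalousness and removing the most anomalous ones before training, can be illustrated with a minimal sketch. The abstract does not name the anomaly detector or the removal rule, and it does not define how pseudo-majority instances are handled, so everything below is an assumption for illustration only: the mean k-nearest-neighbour distance used as the anomaly score, the function names `knn_anomaly_scores` and `undersample_majority`, and the `n_remove` parameter are hypothetical and are not the authors' method.

```python
import math


def knn_anomaly_scores(points, k=3):
    """Score each point by its mean distance to its k nearest neighbours
    within the same set (assumed detector; larger score = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest) if nearest else 0.0)
    return scores


def undersample_majority(X, y, majority_label, n_remove, k=3):
    """Drop the n_remove most anomalous majority-class rows; keep all other rows."""
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    scores = knn_anomaly_scores([X[i] for i in maj_idx], k=k)
    # Rank majority rows from most to least anomalous.
    ranked = sorted(zip(scores, maj_idx), reverse=True)
    drop = {idx for _, idx in ranked[:n_remove]}
    keep = [i for i in range(len(y)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: four clustered majority points, one majority outlier, two minority points.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (1.0, 1.0), (1.1, 1.0)]
y = [0, 0, 0, 0, 0, 1, 1]
X_res, y_res = undersample_majority(X, y, majority_label=0, n_remove=1, k=2)
```

In this toy example the isolated majority point at (5.0, 5.0) has by far the largest score and is the one removed, while the minority class is left untouched. A real experiment would likely swap in an established detector and choose the number of removals from the class ratio; those choices are outside what the abstract specifies.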