An Improved Imbalanced Data Classification Algorithm Based on SVM

Mingkui Yan, Jun Wang, Dan Li, Jin Meng
DOI: https://doi.org/10.1109/ICCSI55536.2022.9970637
2022-01-01
Abstract: Most datasets collected for real-world classification problems are imbalanced. A dataset is imbalanced when the number of samples in one class greatly exceeds the number in the other classes, yet the minority class is often the main object of study. When classifying imbalanced datasets, it is easy to misclassify minority class samples, which carry higher misclassification costs. The classification of imbalanced datasets is therefore one of the main difficulties in the field of data mining. In this paper, we propose a support vector machine (SVM) algorithm based on an improved whale optimization algorithm (WOA), called SWOA-SVM. The algorithm introduces the social group optimization (SGO) algorithm to address WOA's tendency toward premature convergence and to improve the WOA optimization process. SWOA-SVM is evaluated against SVM and other improved algorithms on multiple commonly used imbalanced datasets, using AUC, accuracy, and G-mean as performance evaluation criteria. The experimental results show that the algorithm effectively improves the recognition rate of positive samples across the different experimental datasets, which verifies its effectiveness.
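To illustrate why G-mean is reported alongside accuracy on imbalanced data, the following is a minimal sketch and not the authors' evaluation code. It uses the standard definition of G-mean as the square root of sensitivity times specificity. The toy labels, the 95:5 class ratio, and the function names are assumptions made only for this example.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn), treating `positive` as the minority class label."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn

def accuracy(y_true, y_pred, positive=1):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fn + fp + tn)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical toy data: 5 positive (minority) and 95 negative samples.
y_true = [1] * 5 + [0] * 95

# Classifier A ignores the minority class and predicts everything negative.
pred_all_negative = [0] * 100

# Classifier B finds 4 of 5 positives at the cost of 10 false alarms.
pred_minority_aware = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

for name, pred in [("all-negative", pred_all_negative),
                   ("minority-aware", pred_minority_aware)]:
    print(name, accuracy(y_true, pred), g_mean(y_true, pred))
```

In this toy setting the all-negative classifier has the higher accuracy but a G-mean of zero, because it never recognizes a positive sample. The minority-aware classifier has lower accuracy but a much higher G-mean.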