RGAN-EL: A GAN and ensemble learning-based hybrid approach for imbalanced data classification

Hongwei Ding,Yu Sun,Zhenyu Wang,Nana Huang,Zhidong Shen,Xiaohui Cui
DOI: https://doi.org/10.1016/j.ipm.2022.103235
2022-12-22
Abstract:Imbalanced sample distribution is usually the main reason for the performance degradation of machine learning algorithms. Based on this, this study proposes a hybrid framework (RGAN-EL) combining generative adversarial networks and ensemble learning method to improve the classification performance of imbalanced data. Firstly, we propose a training sample selection strategy based on roulette wheel selection method to make GAN pay more attention to the class overlapping area when fitting the sample distribution. Secondly, we design two kinds of generator training loss, and propose a noise sample filtering method to improve the quality of generated samples. Then, minority class samples are oversampled using the improved RGAN to obtain a balanced training sample set. Finally, combined with the ensemble learning strategy, the final training and prediction are carried out. We conducted experiments on 41 real imbalanced data sets using two evaluation indexes: F1-score and AUC. Specifically, we compare RGAN-EL with six typical ensemble learning; RGAN is compared with three typical GAN models. The experimental results show that RGAN-EL is significantly better than the other six ensemble learning methods, and RGAN is greatly improved compared with three classical GAN models.
computer science, information systems,information science & library science
What problem does this paper attempt to address?