Abstract: Tackling the imbalanced problems encountered in real‐world applications remains a challenge. Oversampling is a widely used method for imbalanced tabular data. However, most traditional oversampling methods generate samples by interpolating between minority (positive) class samples, and so fail to fully capture the probability density distribution of the original data. This paper presents GAN‐E, a novel oversampling method based on a generative adversarial network (GAN) that introduces three strategies to enhance the distribution of the positive class. The first strategy injects prior knowledge of the positive class into the latent space of the GAN, improving sample emulation. The second strategy injects random noise containing this prior knowledge into both the original and the generated positive samples, stretching the learning space of the GAN's discriminator. The third strategy uses multiple GANs to learn comprehensive probability distributions of the positive class from multi‐scale data, countering the tendency of a GAN to generate aggregated samples. Experimental results and statistical tests on 18 commonly used imbalanced datasets show that the proposed method outperforms 14 other rebalancing methods in terms of G‐mean, F‐measure, AUC and accuracy.
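To make the limitation of interpolation-based oversampling concrete, the following is a minimal sketch of a SMOTE-style generator, not the GAN‐E method itself. The function name `interpolate_oversample`, its parameters, and the toy data are assumptions introduced here for illustration. The sketch shows that every synthetic point lies on a line segment between two existing minority samples, so the generated data cannot leave the region spanned by the originals.

```python
import math
import random


def interpolate_oversample(minority, n_new, k=3, seed=0):
    """SMOTE-style sketch (hypothetical helper): each synthetic point lies on
    the segment between a minority sample and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy minority class: four 2-D points (illustrative data only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = interpolate_oversample(minority, n_new=6)

# Every synthetic point stays inside the bounding box of the originals.
inside = all(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for x, y in new_points)
print(len(new_points), inside)
```

Because the interpolation weight is confined to [0, 1), no sample can fall outside the segments joining existing points, which is one way to read the abstract's remark that interpolation does not fully capture the underlying density.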
