Abstract: Handling imbalanced data remains a challenge in real‐world applications. Oversampling is widely used for imbalanced tabular data. However, most traditional oversampling methods generate samples by interpolating between minority (positive) class instances and therefore fail to fully capture the probability density distribution of the original data. This paper presents a novel oversampling method based on the generative adversarial network (GAN), called GAN‐E, which introduces three strategies to enhance the learned distribution of the positive class. The first strategy injects prior knowledge of the positive class into the latent space of the GAN to improve sample emulation. The second strategy injects random noise containing this prior knowledge into both the original and the generated positive samples to stretch the learning space of the GAN's discriminator. The third strategy uses multiple GANs to learn comprehensive probability distributions of the positive class from multi‐scale data, counteracting the tendency of a single GAN to generate aggregated samples. Experimental results and statistical tests on 18 commonly used imbalanced datasets show that the proposed method outperforms 14 other rebalancing methods in terms of G‐mean, F‐measure, AUC and accuracy.
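To make the first two strategies more concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not say what form the prior knowledge takes, so the sketch assumes it is the per‐feature mean and standard deviation of the positive class. It also assumes the latent dimension equals the feature dimension. The function names and the `scale` parameter are hypothetical. The third strategy (multiple GANs on multi‐scale data) and the GAN networks themselves are left out.

```python
import numpy as np


def fit_positive_prior(X_pos):
    """Assumed form of 'prior knowledge': per-feature mean and std of the positive class."""
    mu = X_pos.mean(axis=0)
    sigma = X_pos.std(axis=0) + 1e-8  # small constant avoids a zero scale
    return mu, sigma


def sample_latent(n, mu, sigma, rng):
    """Strategy 1 (assumed reading): draw latent vectors from N(mu, sigma^2)
    rather than from a standard normal, so the generator input already
    reflects positive-class statistics."""
    return rng.normal(loc=mu, scale=sigma, size=(n, mu.shape[0]))


def perturb_for_discriminator(X, sigma, rng, scale=0.1):
    """Strategy 2 (assumed reading): add zero-mean noise whose magnitude is
    tied to the positive-class std. Intended for both real and generated
    positive samples before they reach the discriminator."""
    noise = rng.normal(0.0, 1.0, size=X.shape) * (scale * sigma)
    return X + noise


rng = np.random.default_rng(0)

# Toy positive-class data: 200 samples, 2 features (purely illustrative).
X_pos = rng.normal(loc=[2.0, -1.0], scale=[0.5, 1.5], size=(200, 2))

mu, sigma = fit_positive_prior(X_pos)
z = sample_latent(64, mu, sigma, rng)                   # generator input
X_noisy = perturb_for_discriminator(X_pos, sigma, rng)  # discriminator input
```

In a full pipeline, `z` would be passed to the generator, and `perturb_for_discriminator` would be applied to real and generated positive samples alike at each discriminator step. How the actual method parameterises either step is not stated in the abstract.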

An improved generative adversarial network to oversample imbalanced datasets

Distribution Enhancement for Imbalanced Data with Generative Adversarial Network

An ensemble oversampling method for imbalanced classification with prior knowledge via generative adversarial network

A new imbalanced data oversampling method based on Bootstrap method and Wasserstein Generative Adversarial Network

An Improved D2GAN‐based oversampling algorithm for imbalanced data classification

Generative adversarial minority enlargement—A local linear over-sampling synthetic method

Using Bidirectional GAN with Improved Training Architecture for Imbalanced Tasks

IDA-GAN: A Novel Imbalanced Data Augmentation GAN

BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions

GenSample: A Genetic Algorithm for Oversampling in Imbalanced Datasets

Incremental Focal Loss GANs

Oversampling Imbalanced Data Based on Convergent WGAN for Network Threat Detection

Over-sampling method for tackling class imbalance in software defect prediction based on generative adversarial networks

SGBGAN: minority class image generation for class-imbalanced datasets

Improving GAN Training via Feature Space Shrinkage

Enhancing and improving the performance of imbalanced class data using novel GBO and SSG: A comparative analysis

An intra-class distribution-focused generative adversarial network approach for imbalanced tabular data learning

EID-GAN: Generative Adversarial Nets for Extremely Imbalanced Data Augmentation

A hybrid sampling method for highly imbalanced and overlapped data classification with complex distribution

Annealing Genetic GAN for Imbalanced Web Data Learning

BAGAN: Data Augmentation with Balancing GAN