Abstract: Regression models place high demands on data quality and balance to achieve accurate predictions. Real datasets, however, often have imbalanced distributions, which directly degrades the prediction accuracy of regression models. To address imbalanced data regression, we propose IRGAN (imbalanced regression generative adversarial network), an algorithm that accounts for the continuity of the target value and the correlation of the data and draws on ideas from optimization and adversarial learning. To account for the context information of the target data and the vanishing-gradient problem in deep networks, we construct a generation module and design a composite loss function. In the early stages of training, the gap between the generated samples and the real samples is large, which easily leads to non-convergence. We therefore design a correction module that learns the internal relationship between the state and action, as well as the subsequent state and reward, of the real samples; it guides the generation module in generating samples and alleviates non-convergence during training. The corrected samples and the real samples are then input into the discriminator module. On this basis, adversarial training is used to generate high-quality samples that balance the original samples. The proposed method is tested on datasets from the fields of aerospace, biology, physics, and chemistry. The similarity between the generated samples and the real samples is measured from multiple perspectives to evaluate the quality of the generated samples, and the results demonstrate the superiority of the generation module. Regression prediction on the samples balanced by IRGAN shows that the proposed algorithm improves prediction accuracy on imbalanced data regression problems.
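To make the notion of imbalance in a continuous target concrete, the following is a minimal sketch and not part of IRGAN itself. It assumes the target range is split into equal-width bins and that "balanced" means every bin holds as many samples as the most populated one; the abstract does not state how IRGAN decides how many samples to generate, so the binning rule, the function names `bin_counts` and `samples_needed`, and the toy data are all illustrative assumptions.

```python
from collections import Counter


def bin_counts(targets, n_bins):
    """Count samples per equal-width bin of a continuous target (assumed binning rule)."""
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all targets identical: everything falls into bin 0
    counts = Counter()
    for y in targets:
        # Clamp so the maximum target lands in the last bin.
        idx = min(int((y - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [counts.get(i, 0) for i in range(n_bins)]


def samples_needed(counts):
    """Synthetic samples per bin required to match the most populated bin."""
    top = max(counts)
    return [top - c for c in counts]


# Toy target values: most samples are small, one lies in the rare high range.
targets = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 2.9]
counts = bin_counts(targets, n_bins=3)
needed = samples_needed(counts)
print(counts, needed)
```

In this toy case the sparse bins are the regions where a generator would have to supply samples; a generative approach such as the one described in the abstract would fill them with synthesized samples rather than duplicates of existing ones.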

Generative Learning for Imbalanced Data Using the Gaussian Mixed Model

Deep Generative Mixture Model for Robust Imbalance Classification

An intra-class distribution-focused generative adversarial network approach for imbalanced tabular data learning

An ensemble oversampling method for imbalanced classification with prior knowledge via generative adversarial network

Distribution Enhancement for Imbalanced Data with Generative Adversarial Network

Gaussian Distribution Based Oversampling for Imbalanced Data Classification

ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets

Annealing Genetic GAN for Imbalanced Web Data Learning

Research on Imbalanced Data Classification Based on Classroom-Like Generative Adversarial Networks

A tutorial on generative adversarial networks with application to classification of imbalanced data

DGM: a data generative model to improve minority class presence in anomaly detection domain

RGAN-EL: A GAN and ensemble learning-based hybrid approach for imbalanced data classification

A new imbalanced data oversampling method based on Bootstrap method and Wasserstein Generative Adversarial Network

Generalized Oversampling for Learning from Imbalanced datasets and Associated Theory

Enhancing and improving the performance of imbalanced class data using novel GBO and SSG: A comparative analysis

Research on Imbalanced Data Regression Based on Confrontation

Global Data Distribution Weighted Synthetic Oversampling Technique for Imbalanced Learning

GMOTE: Gaussian based minority oversampling technique for imbalanced classification adapting tail probability of outliers

Generative adversarial minority enlargement—A local linear over-sampling synthetic method

A novel generative adversarial networks modelling for the class imbalance problem in high dimensional omics data

Synthetic Information towards Maximum Posterior Ratio for deep learning on Imbalanced Data