BSCGAN: Structured Minority Class Image Generation under Class-Balanced Pretraining

Qian Wan, Bin Zhou, Yanjiang Wang
DOI: https://doi.org/10.1007/s00371-024-03635-5
IF: 2.835
2024-01-01
The Visual Computer
Abstract: Class imbalance is a common challenge in image generation. Conventional generative adversarial networks (GANs) trained on datasets with imbalanced class distributions tend to generate samples predominantly from the majority class. Generative adversarial minority oversampling (GAMO) and deep synthetic minority oversampling technique (SMOTE) extend oversampling to deep learning to improve classification performance on the minority class. However, these methods exploit neither the data distribution nor the data structure, resulting in low utilization of the data space. To overcome these shortcomings, this study proposes balanced pretraining for structured conditional generative adversarial network (BSCGAN), which performs oversampling in the low-dimensional latent space of a gated variational autoencoder with self-attention (SA-GVAE) and uses a repeat minority samples (RMS) module to provide balanced pretraining conditions. In addition, a structured samples generator (SSG) is introduced to calculate the structure loss and retain the covariance structure of each class. BSCGAN is evaluated on the Fashion-MNIST, CIFAR-10, BreakHis and WAFER datasets, where it achieves higher average class specific accuracy (ACSA) and geometric mean (GM) than state-of-the-art methods. BSCGAN improves both the quantity and the quality of the minority class samples.
The diagram outlines the technical approach of the paper, in which BSCGAN is introduced to tackle class imbalance within a dataset. BSCGAN functions as an image augmentation tool that aims to produce high-quality images. It performs oversampling in the low-dimensional latent space of the SA-GVAE, uses the RMS module to provide balanced pretraining conditions, and introduces the SSG to calculate the structure loss and retain the covariance structure of each class. Finally, the generated minority samples are used to rebalance the classes in the dataset, which improves classifier performance.
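The two evaluation metrics named in the abstract, ACSA and GM, can be illustrated with a minimal sketch. It assumes the definitions commonly used in the imbalanced-learning literature: ACSA as the arithmetic mean of per-class recall, and GM as the geometric mean of per-class recall. The abstract does not show the authors' evaluation code, so the function names (`per_class_recall`, `acsa`, `gm`) and the toy labels are hypothetical.

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred, num_classes):
    """Return the recall of each class 0..num_classes-1 as a list."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = []
    for c in range(num_classes):
        if totals[c] == 0:
            raise ValueError(f"class {c} has no samples in y_true")
        recalls.append(hits[c] / totals[c])
    return recalls


def acsa(y_true, y_pred, num_classes):
    """Average class specific accuracy: arithmetic mean of per-class recall (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred, num_classes)
    return sum(recalls) / num_classes


def gm(y_true, y_pred, num_classes):
    """Geometric mean of per-class recall (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred, num_classes)
    return math.prod(recalls) ** (1.0 / num_classes)


if __name__ == "__main__":
    # Toy imbalanced example: 8 majority samples (class 0), 2 minority samples (class 1).
    y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # one minority sample is misclassified
    print("ACSA:", acsa(y_true, y_pred, 2))  # per-class recalls are 1.0 and 0.5
    print("GM:  ", gm(y_true, y_pred, 2))
```

In this toy case, plain accuracy is 0.9 while ACSA is 0.75 and GM is about 0.707. Both metrics weight every class equally, which is why they penalize poor minority-class performance that overall accuracy would hide.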