Augmenting data with generative adversarial networks: An overview

Hrvoje Ljubić,Goran Martinović,Tomislav Volarić
DOI: https://doi.org/10.3233/ida-215735
IF: 1.7
2022-03-14
Intelligent Data Analysis
Abstract:Performance of neural networks greatly depends on quality, size and balance of training dataset. In a real environment datasets are rarely balanced and training deep models over such data is one of the main challenges of deep learning. In order to reduce this problem, methods and techniques are borrowed from the traditional machine learning. Conversely, generative adversarial networks (GAN) were created and developed, a relatively new type of generative models that are based on game theory and consist of two neural networks, a generator and a discriminator. The generator’s task is to create a sample from the input noise that is based on training data distribution and the discriminator should detect those samples as fake. This process goes through a finite number of iterations until the generator successfully fools the discriminator. When this occurs, sample becomes a part of new (augmented) dataset. Even though the original GAN creates unlabeled samples, variants that soon appeared removed that limitation. Generating artificial data through these networks appears to be a meaningful solution to the imbalance problem since it turned out that artificial samples created by GAN are difficult to differentiate from the real ones. In this manner, new samples of minority class could be created and dataset imbalance ratio lowered.
computer science, artificial intelligence
What problem does this paper attempt to address?