DADA: Deep Adversarial Data Augmentation for Extremely Low Data Regime Classification

Xiaofeng Zhang, Zhangyang Wang, Dong Liu, Qing Ling
DOI: https://doi.org/10.48550/arXiv.1809.00981
2018-08-29
Abstract: Deep learning has revolutionized the performance of classification, but meanwhile demands sufficient labeled data for training. Given insufficient data, while many techniques have been developed to help combat overfitting, the challenge remains if one tries to train deep networks, especially in the ill-posed extremely low data regimes: only a small set of labeled data are available, and nothing -- including unlabeled data -- else. Such regimes arise from practical situations where not only data labeling but also data collection itself is expensive. We propose a deep adversarial data augmentation (DADA) technique to address the problem, in which we elaborately formulate data augmentation as a problem of training a class-conditional and supervised generative adversarial network (GAN). Specifically, a new discriminator loss is proposed to fit the goal of data augmentation, through which both real and augmented samples are enforced to contribute to and be consistent in finding the decision boundaries. Tailored training techniques are developed accordingly. To quantitatively validate its effectiveness, we first perform extensive simulations to show that DADA substantially outperforms both traditional data augmentation and a few GAN-based options. We then extend experiments to three real-world small labeled datasets where existing data augmentation and/or transfer learning strategies are either less effective or infeasible. All results endorse the superior capability of DADA in enhancing the generalization ability of deep networks trained in practical extremely low data regimes. Source code is available at https://github.com/SchafferZhang/DADA.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the problem of training deep network classifiers with extremely limited labeled data. In this setting, not only is labeled data scarce, but unlabeled data is unavailable as well. Such extremely low-data regimes are common in practice, especially where data labeling or collection is itself expensive, for example in military and medical imaging. The authors propose Deep Adversarial Data Augmentation (DADA) to meet this challenge. By carefully designing a class-conditional, supervised Generative Adversarial Network (GAN), DADA aims to generate synthetic samples that improve classification performance, thereby enhancing the model's generalization when only a few samples are available.

### Main contributions of the paper:

1. **Proposing the DADA framework**: To train deep classifiers with extremely little data, the paper introduces a learning-based data augmentation method that does not rely on any domain-specific prior knowledge or unlabeled data. The data augmentation module and the classifier are jointly modeled as a fully supervised Generative Adversarial Network (GAN).
2. **A new loss function**: A new GAN discriminator loss, called the 2k loss, is proposed. It not only learns to classify real images but also enforces fine-grained classification among the multiple "fake classes" of generated samples. This keeps the augmented samples distinguishable across classes and makes their decision boundaries consistent with those of the real samples.
3. **Experimental verification**: Extensive experiments on CIFAR-10, CIFAR-100 and SVHN show that DADA significantly outperforms traditional data augmentation and other GAN-based methods in extremely low-data settings. The paper also runs experiments on three real-world small datasets to further verify DADA's effectiveness.

### Experimental results:

- **CIFAR-10 and CIFAR-100**: DADA performs well across different numbers of training samples, especially when the number of samples is small (for example, 500 samples per class). DADA_augmented (DADA combined with traditional data augmentation) performs best in all experimental settings, improving Top-1 accuracy by about 8% compared to other methods.
- **SVHN**: DADA also performs best in extremely low-data settings. Its advantage declines slightly when more samples are available (500 samples per class), yet it remains better than other methods.
- **Real-world small datasets**: The paper runs experiments on Karolinska Directed Emotional Faces (KDEF), Brain-Computer Interface (BCI) Competition, and the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM). The results show that DADA also significantly improves classification performance on these datasets.

In conclusion, by proposing the DADA framework, the paper tackles the difficult problem of training deep classifiers with extremely little data and verifies the framework's effectiveness on multiple datasets.
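
To make the 2k loss listed under the main contributions more concrete, the following is a minimal sketch of a 2k-way discriminator loss in plain Python. It is not the authors' implementation. The index layout (real classes at positions 0..k-1, the matching fake classes at k..2k-1) and all function names (`two_k_targets`, `discriminator_loss`, and so on) are assumptions made for illustration, and the paper's full training objective may include additional terms that are not shown here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def two_k_targets(labels, k, is_real):
    """Map class labels in [0, k) to targets in [0, 2k).

    Assumed layout (hypothetical): index y is "real, class y",
    index y + k is "fake, class y".
    """
    offset = 0 if is_real else k
    return [y + offset for y in labels]


def cross_entropy(logit_rows, targets):
    """Mean negative log-likelihood of the target index in each row."""
    losses = [-log_softmax(row)[t] for row, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)


def discriminator_loss(real_logits, real_labels, fake_logits, fake_labels, k):
    """2k-way cross-entropy over real samples and generated samples.

    Real samples are pushed toward their real-class index; generated
    samples are pushed toward the fake-class index of the label they
    were conditioned on.
    """
    real_term = cross_entropy(real_logits, two_k_targets(real_labels, k, True))
    fake_term = cross_entropy(fake_logits, two_k_targets(fake_labels, k, False))
    return real_term + fake_term


if __name__ == "__main__":
    k = 3  # number of real classes, so the discriminator has 2k = 6 outputs
    real_logits = [[2.0, 0.1, 0.1, 0.0, 0.0, 0.0]]  # one real sample, class 0
    fake_logits = [[0.0, 0.0, 0.0, 0.1, 0.1, 2.0]]  # one generated sample, class 2
    print(discriminator_loss(real_logits, [0], fake_logits, [2], k))
```

Under this layout, a discriminator that cannot tell any of the 2k outputs apart (all logits equal) incurs a loss of `2 * log(2k)`, and the loss falls as real and generated samples are each assigned to the correct class-specific slot.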