Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions

Saeed Khorram, Mingqi Jiang, Mohamad Shahbazi, Mohamad H. Danesh, Li Fuxin
2024-06-17
Abstract: Despite extensive research on training generative adversarial networks (GANs) with limited training data, learning to generate images from long-tailed training distributions remains fairly unexplored. In the presence of imbalanced multi-class training data, GANs tend to favor classes with more samples, leading to the generation of low-quality and less diverse samples in tail classes. In this study, we aim to improve the training of class-conditional GANs with long-tailed data. We propose a straightforward yet effective method for knowledge sharing, allowing tail classes to borrow from the rich information from classes with more abundant training data. More concretely, we propose modifications to existing class-conditional GAN architectures to ensure that the lower-resolution layers of the generator are trained entirely unconditionally while reserving class-conditional generation for the higher-resolution layers. Experiments on several long-tail benchmarks and GAN architectures demonstrate a significant improvement over existing methods in both the diversity and fidelity of the generated images. The code is available at https://github.com/khorrams/utlo.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the difficulty of training class-conditional GANs (cGANs) on long-tailed datasets. When some categories have far fewer samples than others, cGANs tend to favor the categories with many samples (head categories), while the images generated for categories with few samples (tail categories) are of low quality and lack diversity. To address this, the authors propose Unconditional Training at Lower Resolutions (UTLO). The method promotes knowledge sharing between head and tail categories by applying an unconditional GAN objective to the lower-resolution layers of the generator: the low-resolution part of the generator is trained entirely unconditionally, while class-conditional generation is reserved for the higher-resolution layers. Tail categories can thereby borrow from the information learned from categories with more abundant training data. In experiments on several long-tailed benchmarks and GAN architectures, the authors report that UTLO significantly improves on existing methods in both the diversity and fidelity of the generated images. The authors also propose GAN evaluation metrics adapted to the long-tailed setting, to assess the quality of generated images more accurately.
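The split between an unconditional low-resolution stage and a class-conditional high-resolution stage can be illustrated with a toy NumPy sketch. This is not the authors' implementation (see the linked repository for that). The class name `ToyUTLOGenerator`, the layer sizes, and the per-class scale-and-shift conditioning are all assumptions chosen for illustration. The point is only that the low-resolution output depends on the latent code alone, so it could be scored by an unconditional objective, while the class label enters only in the higher-resolution stage.

```python
import numpy as np


class ToyUTLOGenerator:
    """Toy generator (hypothetical): unconditional low-res stage,
    class-conditional high-res stage."""

    def __init__(self, z_dim=16, num_classes=5, low_res=4, channels=8, seed=0):
        rng = np.random.default_rng(seed)
        self.low_res = low_res
        self.channels = channels
        # Low-resolution stage: sees only the latent z, never the class label.
        self.w_low = rng.normal(0, 0.1, size=(z_dim, channels * low_res * low_res))
        # Projects low-res features to an image that an unconditional
        # objective could score (assumed design for this sketch).
        self.w_rgb_low = rng.normal(0, 0.1, size=(channels, 3))
        # High-resolution stage: per-class scale and shift of the features,
        # a simple stand-in for class conditioning.
        self.class_scale = 1.0 + rng.normal(0, 0.1, size=(num_classes, channels))
        self.class_shift = rng.normal(0, 0.1, size=(num_classes, channels))
        self.w_rgb_high = rng.normal(0, 0.1, size=(channels, 3))

    def forward(self, z, labels):
        n = z.shape[0]
        # Shared low-res features, shape (n, H, W, C); no label used here.
        feat = np.tanh(z @ self.w_low).reshape(
            n, self.low_res, self.low_res, self.channels
        )
        low_img = np.tanh(feat @ self.w_rgb_low)  # (n, H, W, 3)
        # Nearest-neighbour 2x upsampling of the shared features.
        up = feat.repeat(2, axis=1).repeat(2, axis=2)
        # Class label enters only at the higher resolution.
        scale = self.class_scale[labels][:, None, None, :]
        shift = self.class_shift[labels][:, None, None, :]
        high_img = np.tanh((up * scale + shift) @ self.w_rgb_high)  # (n, 2H, 2W, 3)
        return low_img, high_img


# Usage: the same latent code with two different class labels.
gen = ToyUTLOGenerator()
z = np.random.default_rng(1).normal(size=(1, 16)).repeat(2, axis=0)
low, high = gen.forward(z, np.array([0, 3]))
```

In this sketch, both samples share the same low-resolution image because the label never reaches that stage, which is the sense in which head and tail classes share the coarse features. Only the high-resolution outputs differ by class.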