CoDeGAN: Contrastive Disentanglement for Generative Adversarial Network

Jiangwei Zhao, Zejia Liu, Xiaohan Guo, Lili Pan
2024-05-31
Abstract: Disentanglement, a critical concern in interpretable machine learning, has also garnered significant attention from the computer vision community. Many existing GAN-based class disentanglement (unsupervised) approaches, such as InfoGAN and its variants, primarily aim to maximize the mutual information (MI) between the generated image and its latent codes. However, this focus may lead to a tendency for the network to generate highly similar images when presented with the same latent class factor, potentially resulting in mode collapse or mode dropping. To alleviate this problem, we propose CoDeGAN (Contrastive Disentanglement for Generative Adversarial Networks), where we relax similarity constraints for disentanglement from the image domain to the feature domain. This modification not only enhances the stability of GAN training but also improves their disentangling capabilities. Moreover, we integrate self-supervised pre-training into CoDeGAN to learn semantic representations, significantly facilitating unsupervised disentanglement. Extensive experimental results demonstrate the superiority of our method over state-of-the-art approaches across multiple benchmarks. The code is available at https://github.com/learninginvision/CoDeGAN.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper aims to achieve better representation disentanglement in Generative Adversarial Networks (GANs), particularly for discrete factors. Existing GAN-based disentanglement methods, such as InfoGAN and its variants, mainly work by maximizing the mutual information (MI) between the generated images and their latent codes. However, this objective can push the generator to produce highly similar images whenever it is given the same latent class factor, which may lead to mode collapse or mode dropping. To address this, the authors propose Contrastive Disentanglement for Generative Adversarial Networks (CoDeGAN). The main contributions of CoDeGAN include:

1. **Relaxing the similarity constraint**: The similarity constraint used for disentanglement is moved from the image domain to the feature domain, which both stabilizes GAN training and improves disentanglement.
2. **Self-supervised pre-training**: Self-supervised pre-training is integrated into CoDeGAN to learn semantic representations, which significantly helps unsupervised disentanglement.
3. **Performance improvement**: Experimental results show that CoDeGAN outperforms existing methods on multiple benchmark datasets. On CIFAR-10 in particular, it achieves absolute improvements of 19% and 16% over InfoGAN and the previous state-of-the-art method, respectively.

Through these changes, CoDeGAN aims to improve both the quality of generated images and the accuracy of disentanglement, while reducing the risk of mode collapse and mode dropping.
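
To make the idea of a feature-domain similarity constraint more concrete, here is a minimal NumPy sketch of a generic supervised-contrastive-style loss. It is not taken from the paper or its repository, and the actual CoDeGAN loss may differ. The sketch assumes that each generated image has already been mapped to a feature vector, and it treats images generated from the same discrete latent code as positives. The function name `class_contrastive_loss`, the `temperature` value, and the toy data are all hypothetical.

```python
import numpy as np


def class_contrastive_loss(features, codes, temperature=0.5):
    """Illustrative contrastive loss in feature space (not the paper's exact loss).

    features: (N, D) array of feature vectors for N generated images.
    codes:    (N,) integer array with the discrete latent code of each image.
    """
    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(codes)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples generated from the same latent code.
    positives = (codes[:, None] == codes[None, :]) & not_self

    # Log-softmax over all other samples (self-similarity is excluded).
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )

    # Mean log-probability of the positives, over anchors that have any.
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / pos_count[has_pos]).mean())


# Toy data (assumption): 3 latent codes, 4 samples each, features near 3 centres.
rng = np.random.default_rng(0)
codes = np.repeat(np.arange(3), 4)
centres = np.eye(3, 8)
features = centres[codes] + 0.05 * rng.normal(size=(12, 8))

# Loss when features cluster by their latent code.
aligned = class_contrastive_loss(features, codes)
# Loss when the codes do not match the feature clusters.
mismatched = class_contrastive_loss(features, np.tile(np.arange(3), 4))
```

In this toy setup the loss is lower when features cluster by latent code than when the codes are mismatched. The constraint only asks same-code samples to be close in feature space, so the images themselves are free to differ in pixel space.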