GAT-GMM: Generative Adversarial Training for Gaussian Mixture Models

Farzan Farnia, William Wang, Subhro Das, Ali Jadbabaie
DOI: https://doi.org/10.48550/arXiv.2006.10293
2020-06-18
Abstract: Generative adversarial networks (GANs) learn the distribution of observed samples through a zero-sum game between two machine players, a generator and a discriminator. While GANs achieve great success in learning the complex distribution of image, sound, and text data, they perform suboptimally in learning multi-modal distribution-learning benchmarks including Gaussian mixture models (GMMs). In this paper, we propose Generative Adversarial Training for Gaussian Mixture Models (GAT-GMM), a minimax GAN framework for learning GMMs. Motivated by optimal transport theory, we design the zero-sum game in GAT-GMM using a random linear generator and a softmax-based quadratic discriminator architecture, which leads to a non-convex concave minimax optimization problem. We show that a Gradient Descent Ascent (GDA) method converges to an approximate stationary minimax point of the GAT-GMM optimization problem. In the benchmark case of a mixture of two symmetric, well-separated Gaussians, we further show this stationary point recovers the true parameters of the underlying GMM. We numerically support our theoretical findings by performing several experiments, which demonstrate that GAT-GMM can perform as well as the expectation-maximization algorithm in learning mixtures of two Gaussians.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the difficulty that Generative Adversarial Networks (GANs) have in learning multi-modal distributions, with Gaussian Mixture Models (GMMs) as the benchmark. Although GANs have achieved great success in learning the complex distributions of image, sound, and text data, they perform poorly on multi-modal distributions such as GMMs and are particularly prone to mode collapse. The paper therefore proposes GAT-GMM (Generative Adversarial Training for Gaussian Mixture Models), a framework that aims to improve GAN performance on GMMs through a purpose-built generator, discriminator, and optimization objective.

### Specific Problems and Solutions

1. **Difficulty in Multi-modal Distribution Learning**:
   - **Problem**: When traditional GANs learn multi-modal distributions, especially GMMs, they are prone to mode collapse: the generator learns only some of the modes of the real distribution and ignores the rest.
   - **Solution**: GAT-GMM pairs a random linear generator with a softmax-based quadratic discriminator. With this design, GAT-GMM achieves performance comparable to the Expectation-Maximization (EM) algorithm when learning GMMs.
2. **Optimization Problem**:
   - **Problem**: GAN training usually involves non-convex optimization, which makes training unstable and hard to converge.
   - **Solution**: By introducing a regularization term, GAT-GMM turns the original W2GAN problem into a non-convex concave minimax optimization problem. The paper proves that the Gradient Descent Ascent (GDA) method converges to an approximate stationary minimax point of this problem.
3. **Theoretical Guarantee**:
   - **Problem**: Theoretical analysis of GANs on multi-modal distributions is lacking, in particular performance guarantees for GMMs.
   - **Solution**: The paper provides guarantees for GAT-GMM on symmetric mixtures of two Gaussians, including bounds on the approximation error, the generalization error, and the optimization error. These results show that GAT-GMM achieves zero approximation error in this setting and generalizes from empirical samples to the real distribution.

### Experimental Verification

To verify the effectiveness of GAT-GMM, the paper runs numerical experiments on symmetric mixtures of two Gaussians and compares GAT-GMM with the EM algorithm and with several standard neural-network-based GANs (such as vanilla GAN, WGAN-WC, and WGAN-GP). In these experiments, GAT-GMM performs comparably to the EM algorithm and, in some cases, better than the other GANs.

### Summary

The GAT-GMM framework targets the mode collapse that GANs exhibit on multi-modal distributions, GMMs in particular, and backs its design with theoretical guarantees. The experiments support these guarantees and show strong performance on symmetric mixtures of two Gaussians.
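The comparison between GDA and EM described above can be illustrated on a deliberately simplified toy problem. The sketch below is not the GAT-GMM objective: it assumes a one-dimensional, equal-weight mixture 0.5·N(μ, 1) + 0.5·N(−μ, 1) with known unit variance, and it replaces the softmax-based quadratic discriminator with a single quadratic weight `a`. The function names (`sample_symmetric_gmm`, `em_symmetric`, `gda_toy`), the step sizes, and the regularizer `-a**2 / 2` are all illustrative assumptions. It only shows the general shape of the approach: a regularized game that is non-convex in the generator parameter and concave in the discriminator parameter, solved by simultaneous gradient descent ascent, next to the standard EM update for this symmetric mixture.

```python
import math
import random


def sample_symmetric_gmm(n, mu, rng):
    """Draw n points from 0.5*N(mu, 1) + 0.5*N(-mu, 1) in one dimension."""
    return [rng.choice((-1.0, 1.0)) * mu + rng.gauss(0.0, 1.0) for _ in range(n)]


def em_symmetric(xs, mu0, iters=100):
    """EM for the balanced, unit-variance symmetric mixture.

    The E and M steps collapse into one update: mu <- mean(x * tanh(mu * x)).
    """
    mu = mu0
    for _ in range(iters):
        mu = sum(x * math.tanh(mu * x) for x in xs) / len(xs)
    return mu


def gda_toy(xs, mu0, eta_mu=0.01, eta_a=0.2, iters=5000):
    """Simultaneous GDA on f(mu, a) = a * (m2 - (mu**2 + 1)) - a**2 / 2.

    m2 is the empirical second moment of the data; mu**2 + 1 is the second
    moment of the toy generator Y*mu + Z (Y = +/-1, Z ~ N(0, 1)).
    The -a**2 / 2 term is an assumed regularizer that makes the game
    strongly concave in the discriminator weight a.
    """
    m2 = sum(x * x for x in xs) / len(xs)
    mu, a = mu0, 0.0
    for _ in range(iters):
        grad_mu = -2.0 * a * mu               # df/dmu (generator descends)
        grad_a = m2 - (mu * mu + 1.0) - a     # df/da (discriminator ascends)
        mu, a = mu - eta_mu * grad_mu, a + eta_a * grad_a
    return mu, a


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_symmetric_gmm(5000, 2.0, rng)
    print("EM estimate: ", em_symmetric(data, mu0=0.5))
    print("GDA estimate:", gda_toy(data, mu0=0.5)[0])
```

In this toy game the stationary point is `a = 0` and `mu**2 = m2 - 1`, so GDA ends up matching second moments, while EM maximizes the likelihood; on well-separated data both land near the true mean. The initial `mu0` must be nonzero, since `mu = 0` is itself a stationary point of the generator update, and its sign decides which of the two equivalent solutions ±μ is returned.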