Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections

Gabriel Loaiza-Ganem, Brendan Leigh Ross, Rasa Hosseinzadeh, Anthony L. Caterini, Jesse C. Cresswell
2024-09-26
Abstract: In recent years there has been increased interest in understanding the interplay between deep generative models (DGMs) and the manifold hypothesis. Research in this area focuses on understanding the reasons why commonly-used DGMs succeed or fail at learning distributions supported on unknown low-dimensional manifolds, as well as developing new models explicitly designed to account for manifold-supported data. This manifold lens provides both clarity as to why some DGMs (e.g. diffusion models and some generative adversarial networks) empirically surpass others (e.g. likelihood-based models such as variational autoencoders, normalizing flows, or energy-based models) at sample generation, and guidance for devising more performant DGMs. We carry out the first survey of DGMs viewed through this lens, making two novel contributions along the way. First, we formally establish that numerical instability of likelihoods in high ambient dimensions is unavoidable when modelling data with low intrinsic dimension. We then show that DGMs on learned representations of autoencoders can be interpreted as approximately minimizing Wasserstein distance: this result, which applies to latent diffusion models, helps justify their outstanding empirical results. The manifold lens provides a rich perspective from which to understand DGMs, and we aim to make this perspective more accessible and widespread.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The core problem this paper addresses is understanding the relationship between deep generative models (DGMs) and the manifold hypothesis, and using that relationship to explain why some DGMs succeed and others fail at learning distributions supported on low-dimensional manifolds. Specifically, the paper focuses on the following aspects:

1. **Manifold Hypothesis**:
   - The paper assumes that high-dimensional data lies on an unknown low-dimensional submanifold: the true data distribution \(p_X^*\) is supported on a \(d^*\)-dimensional submanifold \(M\), where \(d^* < D\) and \(D\) is the ambient dimension.
   - This hypothesis is crucial for understanding the behavior of DGMs, because the success or failure of many DGMs can be attributed to how they handle this low-dimensional structure.
2. **Performance Differences of DGMs**:
   - Some DGMs (such as diffusion models and certain generative adversarial networks) empirically outperform others (likelihood-based models such as variational autoencoders, normalizing flows, or energy-based models) at sample generation.
   - The paper explains the reasons for this performance gap and provides guidance for designing more performant DGMs.
3. **Numerical Instability and Optimization Objectives**:
   - The paper formally establishes that numerical instability of likelihoods in high ambient dimensions is unavoidable when modeling data with low intrinsic dimension.
   - The paper also shows that DGMs trained on the learned representations of autoencoders can be interpreted as approximately minimizing the Wasserstein distance. This result applies to latent diffusion models and helps justify their strong empirical results.
4. **Manifold-Aware DGMs**:
   - The paper discusses various manifold-aware DGMs, including methods that add noise, methods that use support-independent optimization objectives, and two-step models.
   - In particular, the paper offers a new perspective: a two-step model not only learns both the manifold and the distribution on it, but also minimizes a (possibly regularized) upper bound on the Wasserstein distance, and this bound can become tight at the optimal solution.

In summary, this paper systematically analyzes and explains the behavior of DGMs from the perspective of the manifold hypothesis, and provides theoretical and practical guidance for designing more effective generative models.
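
The likelihood instability described in point 3 can be illustrated with a toy example. The sketch below is not the paper's proof or any of its models; it is a minimal illustration under assumed choices: 2-D data that lies exactly on a line (so \(d^* = 1\), \(D = 2\)), and a zero-mean Gaussian model whose covariance is the line's covariance plus isotropic noise of standard deviation `noise_std`. The function names and parameters are hypothetical. As `noise_std` shrinks and the model density concentrates on the line, the average log-likelihood of the data grows without bound.

```python
import numpy as np


def avg_log_likelihood(points, cov):
    """Average log-density of the zero-mean Gaussian N(0, cov) at `points`."""
    dim = points.shape[1]
    # slogdet returns (sign, log|det|); it stays stable for near-singular cov.
    _, logdet = np.linalg.slogdet(cov)
    # Mahalanobis terms x^T cov^{-1} x, computed without forming the inverse.
    solved = np.linalg.solve(cov, points.T)      # shape (dim, n)
    maha = np.sum(points.T * solved, axis=0)     # shape (n,)
    return float(np.mean(-0.5 * (dim * np.log(2.0 * np.pi) + logdet + maha)))


def likelihoods_vs_noise(noise_stds, n=1000, seed=0):
    """Evaluate the data likelihood as the model collapses onto the line.

    Assumption for illustration: data is t * u with t ~ N(0, 1) and u a
    fixed unit vector, so its support is a 1-D line inside R^2.
    """
    rng = np.random.default_rng(seed)
    u = np.array([3.0, 4.0])
    u = u / np.linalg.norm(u)
    t = rng.standard_normal(n)
    points = t[:, None] * u[None, :]             # shape (n, 2), all on the line

    results = []
    for s in noise_stds:
        # Covariance of "data on the line plus isotropic noise of std s".
        cov = np.outer(u, u) + s**2 * np.eye(2)
        results.append(avg_log_likelihood(points, cov))
    return results


if __name__ == "__main__":
    stds = [1e-1, 1e-2, 1e-3, 1e-4]
    for s, ll in zip(stds, likelihoods_vs_noise(stds)):
        print(f"noise_std={s:.0e}  avg log-likelihood={ll:.3f}")
```

Each tenfold reduction in `noise_std` raises the average log-likelihood by roughly \(\log 10\), because the off-line eigenvalue of the covariance is `noise_std**2` and enters the log-density through \(-\tfrac{1}{2}\log\det\). A density that fits manifold-supported data ever more closely must therefore take ever larger values, which is the kind of behavior that makes likelihoods numerically fragile in this setting.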