Balanced Training of Energy-Based Models with Adaptive Flow Sampling

Louis Grenioux, Éric Moulines, Marylou Gabrié
2024-02-19
Abstract: Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density. Although very flexible, EBMs lack a specified normalization constant of the model, making the likelihood of the model computationally intractable. Several approximate samplers and variational inference techniques have been proposed to estimate the likelihood gradients for training. These techniques have shown promising results in generating samples, but little attention has been paid to the statistical accuracy of the estimated density, such as determining the relative importance of different classes in a dataset. In this work, we propose a new maximum likelihood training algorithm for EBMs that uses a different type of generative model, normalizing flows (NF), which have recently been proposed to facilitate sampling. Our method fits an NF to an EBM during training so that an NF-assisted sampling scheme provides an accurate gradient for the EBMs at all times, ultimately leading to a fast sampler for generating new data.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the computational challenges of training energy-based models (EBMs), in particular when the goal is accurate density estimation for multimodal data distributions. Specifically:

1. **Computational Complexity**: EBMs lack an explicit normalization constant \( Z_\theta \), which makes the model likelihood intractable. Without \( Z_\theta \), maximum-likelihood training cannot be applied directly.
2. **Sampling Difficulty**: Markov chain Monte Carlo (MCMC) algorithms and variational inference (VI) techniques can approximate the likelihood gradient, but they handle multimodal distributions poorly, in particular when it comes to representing the relative importance of the different modes.
3. **Statistical Accuracy**: Existing training methods often overlook the statistical accuracy of the estimated density, such as the relative importance of different classes in the dataset. This matters for applications that require accurate modeling of multimodal data.

To solve these problems, the paper proposes a new maximum-likelihood training algorithm that uses normalizing flows (NFs) to assist the training of EBMs:

- **Combining Normalizing Flows**: An NF is fitted to the EBM throughout training, so that an NF-assisted sampling scheme can supply accurate gradient estimates at every step. The fitted NF also serves as a fast sampler for generating new data once training is done.
- **Adaptive Flow Sampling**: A calibrated MCMC sampler uses the NF as a source of independent proposals. Because these proposals are not local moves, the chain can mix quickly between different modes, which improves both training efficiency and statistical accuracy.
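For reference, the likelihood gradient that these samplers have to estimate can be written out explicitly. The notation below is an assumption made for illustration: \( E_\theta \) denotes the energy (the negative unnormalized log density) and is not taken from the summary above; only \( Z_\theta \) is.

\[
p_\theta(x) = \frac{e^{-E_\theta(x)}}{Z_\theta}, \qquad Z_\theta = \int e^{-E_\theta(x)} \, dx
\]

\[
\nabla_\theta \log p_\theta(x) = -\nabla_\theta E_\theta(x) - \nabla_\theta \log Z_\theta = -\nabla_\theta E_\theta(x) + \mathbb{E}_{x' \sim p_\theta}\left[ \nabla_\theta E_\theta(x') \right]
\]

The second equality uses \( \nabla_\theta \log Z_\theta = -\mathbb{E}_{x' \sim p_\theta}\left[ \nabla_\theta E_\theta(x') \right] \), which follows from differentiating under the integral. The first term only requires the data point \( x \); the second requires samples from the model itself. If the sampler misrepresents the relative weights of the modes, this expectation is biased, and so is the gradient.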
The proposed method thus targets both the intractable-gradient and the mode-mixing problems of earlier approaches, with the aim of estimating the relative importance of the different modes of multimodal data accurately.
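The role of independent proposals in mode mixing can be illustrated with a minimal sketch of an independence Metropolis-Hastings sampler. This is not the authors' implementation: the one-dimensional bimodal target stands in for an EBM, a fixed broad Gaussian stands in for the fitted normalizing flow, and all names (`log_target`, `proposal_sample`, `proposal_log_prob`, `independence_mh`) are hypothetical.

```python
import numpy as np

PROPOSAL_STD = 5.0  # assumption: a broad Gaussian plays the role of the flow


def log_target(x):
    """Unnormalized log density of a toy bimodal target (stand-in for an EBM).

    Two unit-variance modes at -4 and +4 with relative weights 0.2 and 0.8.
    """
    return np.logaddexp(np.log(0.2) - 0.5 * (x + 4.0) ** 2,
                        np.log(0.8) - 0.5 * (x - 4.0) ** 2)


def proposal_sample(rng, n):
    """Draw n independent proposals (a trained flow would be sampled here)."""
    return rng.normal(0.0, PROPOSAL_STD, size=n)


def proposal_log_prob(x):
    """Exact log density of the proposal, as a flow would also provide."""
    return -0.5 * (x / PROPOSAL_STD) ** 2 - np.log(PROPOSAL_STD * np.sqrt(2.0 * np.pi))


def independence_mh(n_steps, rng, x0=0.0):
    """Independence Metropolis-Hastings: proposals do not depend on the current state."""
    x = x0
    log_w = log_target(x) - proposal_log_prob(x)  # log importance weight of current state
    proposals = proposal_sample(rng, n_steps)
    log_u = np.log(1.0 - rng.random(n_steps))  # 1 - U lies in (0, 1], so the log is finite
    chain = np.empty(n_steps)
    n_accept = 0
    for t in range(n_steps):
        y = proposals[t]
        log_w_y = log_target(y) - proposal_log_prob(y)
        # Accept with probability min(1, w(y) / w(x)).
        if log_u[t] < log_w_y - log_w:
            x, log_w = y, log_w_y
            n_accept += 1
        chain[t] = x
    return chain, n_accept / n_steps


rng = np.random.default_rng(0)
chain, acceptance_rate = independence_mh(20_000, rng)
right_mode_fraction = float(np.mean(chain > 0.0))
print(f"acceptance rate: {acceptance_rate:.2f}")
print(f"fraction of samples in the right-hand mode: {right_mode_fraction:.2f}")
```

Because each proposal is drawn independently of the current state, the chain can jump from one mode to the other in a single step, and the fraction of samples in the right-hand mode should come out close to the target weight of 0.8. A random-walk sampler with a small step size would typically stay in the mode where it started, which is the failure the summary describes for multimodal targets.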