Epanechnikov Variational Autoencoder

Tian Qin, Wei-Min Huang
2024-05-21
Abstract: In this paper, we bridge Variational Autoencoders (VAEs) [17] and kernel density estimates (KDEs) [25], [23] by approximating the posterior with KDEs and deriving an upper bound on the Kullback-Leibler (KL) divergence in the evidence lower bound (ELBO). The flexibility of KDEs makes it possible to optimize the posteriors in VAEs, which not only addresses the limitations of the Gaussian latent space in the vanilla VAE but also provides a new perspective on estimating the KL divergence in the ELBO. Under appropriate conditions [9], [3], we show that the Epanechnikov kernel is the asymptotically optimal choice for minimizing the derived upper bound on the KL divergence. Compared with the Gaussian kernel, the Epanechnikov kernel has compact support, which should make the generated samples less noisy and blurry. Implementing the Epanechnikov kernel in the ELBO is straightforward because it belongs to the "location-scale" family of distributions, to which the reparametrization trick applies directly. A series of experiments on benchmark datasets such as MNIST, Fashion-MNIST, CIFAR-10 and CelebA further demonstrates the superiority of the Epanechnikov Variational Autoencoder (EVAE) over the vanilla VAE in the quality of reconstructed images, as measured by the FID score and Sharpness [27].
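The "location-scale" remark in the abstract can be made concrete with a small sketch. The standard Epanechnikov density is K(u) = 0.75 (1 - u^2) on [-1, 1], so a latent sample can be written as z = mu + sigma * eps with eps drawn from K. The abstract does not say how eps is sampled; the inverse-CDF sampler below is an assumption chosen for illustration, and the function names `sample_epanechnikov` and `reparameterize` are hypothetical. A real VAE would do the same computation inside an autodiff framework so that gradients flow through mu and sigma.

```python
import math
import random


def sample_epanechnikov(rng: random.Random) -> float:
    """Draw one sample from K(u) = 0.75 * (1 - u^2) on [-1, 1] by inverting its CDF.

    The CDF is F(u) = 1/2 + 3u/4 - u^3/4. Substituting u = 2*sin(t) turns
    F(u) = p into sin(3t) = 2p - 1, which gives the closed form used here.
    (Sampler choice is an assumption, not taken from the paper.)
    """
    p = rng.random()  # uniform on [0, 1)
    return 2.0 * math.sin(math.asin(2.0 * p - 1.0) / 3.0)


def reparameterize(mu, sigma, rng):
    """Location-scale reparameterization: z = mu + sigma * eps, eps ~ Epanechnikov."""
    return [m + s * sample_epanechnikov(rng) for m, s in zip(mu, sigma)]


rng = random.Random(0)

# Noise samples: bounded in [-1, 1], mean 0 by symmetry, variance 1/5.
eps = [sample_epanechnikov(rng) for _ in range(100_000)]
empirical_var = sum(e * e for e in eps) / len(eps)

# Hypothetical encoder outputs for a 2-dimensional latent code.
mu = [0.5, -1.0]
sigma = [0.1, 2.0]
z = reparameterize(mu, sigma, rng)
```

Because the noise is bounded, each coordinate of `z` is guaranteed to lie within `sigma` of `mu`, unlike the Gaussian case where arbitrarily large deviations have nonzero probability.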
Machine Learning
What problem does this paper attempt to address?
The paper attempts to address the following issues:

1. **Improving the posterior distribution representation of Variational Autoencoders (VAEs)**: The paper proposes the Epanechnikov Variational Autoencoder (EVAE), which combines kernel density estimation (KDE) with the VAE to optimize the approximation of the posterior distribution. Traditional VAEs usually assume a Gaussian latent space, which can be overly simplistic and limits their expressive power. Estimating the posterior with the Epanechnikov kernel relaxes this Gaussian assumption and also provides a new perspective on estimating the KL divergence in the ELBO.
2. **Minimizing the upper bound of the KL divergence**: The authors show that, under certain conditions, the Epanechnikov kernel is the asymptotically optimal choice for minimizing the derived upper bound of the KL divergence. Unlike the Gaussian kernel, the Epanechnikov kernel has compact support, which is expected to make the generated samples less noisy and blurry.
3. **Improving image reconstruction quality**: Through a series of experiments on standard image datasets (MNIST, Fashion-MNIST, CIFAR-10, and CelebA), the paper shows that EVAE outperforms the vanilla VAE in reconstruction quality, as measured by the FID score and Sharpness.

In summary, the paper aims to improve the representation of the posterior distribution in VAEs by introducing the Epanechnikov kernel and validates this approach experimentally on benchmark image datasets.
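The compact-support property mentioned above can be checked numerically with a minimal one-dimensional KDE sketch. The data points, bandwidth, and evaluation point are hypothetical values chosen only for illustration; they are not taken from the paper's experiments.

```python
import math


def epanechnikov_kernel(u: float) -> float:
    """K(u) = 0.75 * (1 - u^2) for |u| <= 1, and exactly 0 elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def gaussian_kernel(u: float) -> float:
    """Standard normal density, strictly positive for every real u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, bandwidth, kernel):
    """Kernel density estimate: (1 / (n * h)) * sum_i K((x - x_i) / h)."""
    return sum(kernel((x - xi) / bandwidth) for xi in data) / (len(data) * bandwidth)


data = [-0.4, -0.1, 0.0, 0.2, 0.5]  # hypothetical 1-D latent codes
h = 0.5                             # hypothetical bandwidth
far_point = 3.0                     # farther than h from every data point

density_epan_far = kde(far_point, data, h, epanechnikov_kernel)
density_gauss_far = kde(far_point, data, h, gaussian_kernel)
density_epan_center = kde(0.0, data, h, epanechnikov_kernel)
```

With these values the Epanechnikov estimate at `far_point` is exactly zero, while the Gaussian estimate is small but strictly positive. This is the sense in which a compactly supported kernel places no mass far from the observed codes.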