Learning Sparse Codes with Entropy-Based ELBOs

Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke
2024-04-10
Abstract: Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike for previous non-trivial approximations, the novel objective is fully analytical; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytical solutions, which leads to the fully analytical objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) for non-trivial posterior approximations, we provide the (to the knowledge of the authors) first analytical ELBO objective for standard probabilistic sparse coding; and (2) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how to learn the parameters of standard probabilistic sparse coding with an entropy-based Evidence Lower Bound (ELBO). Standard sparse coding assumes a Laplace prior, a linear mapping from latent variables to observables, and a Gaussian observation distribution. The authors derive a learning objective based solely on entropies, with three features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike previous non-trivial approximations, it is fully analytical; and (C) it allows a principled form of annealing. They first show that the standard ELBO converges to a sum of entropies, then show that the conditions under which the ELBO equals that entropy sum have analytical solutions, which yields the fully analytical objective. Numerical experiments demonstrate that learning with entropy-based ELBOs is feasible, compare posterior approximations including Gaussians with correlated latents and deep amortized approximations, and find that entropy-based annealing improves learning. The main contributions are theoretical: (1) for non-trivial posterior approximations, what the authors believe to be the first analytical ELBO objective for standard probabilistic sparse coding, and (2) the first demonstration of how the recently shown convergence of the ELBO to entropy sums can be used for learning.
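To make the model assumptions concrete, the following is a minimal sketch of the standard sparse coding generative model described above: Laplace prior, linear mapping, Gaussian noise. It is not the authors' code. The names `sample_sparse_coding`, `laplace_entropy`, and `gaussian_entropy`, the unit Laplace scale, and the isotropic noise with standard deviation `sigma` are assumptions made for illustration. The two entropy functions use the textbook closed forms for Laplace and Gaussian distributions; how the paper combines entropies into its learning objective is not reproduced here.

```python
import math
import random


def sample_sparse_coding(W, sigma, rng):
    """Draw one (z, x) pair with z_h ~ Laplace(0, 1) and x ~ N(W z, sigma^2 I).

    W is a D x H matrix given as a list of D rows (hypothetical layout).
    """
    D, H = len(W), len(W[0])
    # A Laplace(0, 1) sample is a random sign times an Exponential(1) magnitude.
    z = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(H)]
    # Linear mapping from latents to the mean of the observables.
    mean = [sum(W[d][h] * z[h] for h in range(H)) for d in range(D)]
    # Isotropic Gaussian observation noise (an assumption of this sketch).
    x = [rng.gauss(m, sigma) for m in mean]
    return z, x


def laplace_entropy(H, scale=1.0):
    """Differential entropy (nats) of H independent Laplace(0, scale) latents."""
    return H * (1.0 + math.log(2.0 * scale))


def gaussian_entropy(D, sigma):
    """Differential entropy (nats) of a D-dimensional N(mu, sigma^2 I)."""
    return 0.5 * D * math.log(2.0 * math.pi * math.e * sigma ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy dictionary: D = 3, H = 2
    z, x = sample_sparse_coding(W, sigma=0.1, rng=rng)
    print("latents:", z)
    print("observables:", x)
    print("prior entropy:", laplace_entropy(H=2))
    print("noise entropy:", gaussian_entropy(D=3, sigma=0.1))
```

Both entropies depend only on model parameters (the Laplace scale and the noise level), not on individual data points, which illustrates why entropy terms of this kind can be evaluated analytically.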