Learning Sparse Codes with Entropy-Based ELBOs

Dmytro Velychko,Simon Damm,Asja Fischer,Jörg Lücke

2024-04-10

Abstract:Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike for previous non-trivial approximations, the novel objective is fully analytical; and (C) the objective allows for a novel principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to entropies are then shown to have analytical solutions, which leads to the fully analytical objective. Numerical experiments are used to demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) for non-trivial posterior approximations, we provide the (to the knowledge of the authors) first analytical ELBO objective for standard probabilistic sparse coding; and (2) we provide the first demonstration on how a recently shown convergence of the ELBO to entropy sums can be used for learning.

Machine Learning

What problem does this paper attempt to address?

This paper mainly discusses the problem of learning sparse coding using entropy-based Evidence Lower Bound (ELBO). Traditional sparse coding assumes Laplace prior, linear mapping from latent variables to observable values, and Gaussian observation distribution. The authors propose a novel learning objective based solely on entropy, which features: (A) using non-trivial posterior approximation for probabilistic inference; (B) a new objective function with complete analytic form; (C) a novel annealing form. They prove that the standard ELBO objective converges to the sum of entropies and find the analytic objective function by analyzing the solvable conditions. Numerical experiments validate the effectiveness of entropy-based ELBO learning, explore different posterior approximations including correlated Gaussian latent variables and deep amortized approximations, and investigate entropy annealing for improving learning performance. The main contribution of this paper is in the theoretical aspect, demonstrating for the first time how to leverage the convergence property of ELBO for learning and deriving the fully analytic ELBO objective for standard sparse coding generative models.

Learning Sparse Codes with Entropy-Based ELBOs

On the Convergence of the ELBO to Entropy Sums

Advancing Bayesian Optimization via Learning Correlated Latent Space

ED-VAE: Entropy Decomposition of ELBO in Variational Autoencoders

Efficient Probabilistic Latent Semantic Analysis with Sparsity Control

A Truncated EM Approach for Spike-and-Slab Sparse Coding

Sparse-Coding Variational Autoencoders

Covariance-Free Sparse Bayesian Learning

Nonsparse learning with latent variables

Sparse-Coding Variational Auto-Encoders

Learning Sparse Representations in Reinforcement Learning with Sparse Coding

Sparse Linear Regression with Constraints: A Flexible Entropy-based Framework

A prescriptive theory for brain-like inference

Leveraging joint sparsity in hierarchical Bayesian learning

Learning Sparsity of Representations with Discrete Latent Variables

Beyond L2-Loss Functions for Learning Sparse Models

Sparse Coding with Multi-Layer Decoders using Variance Regularization

Local Latent Space Bayesian Optimization over Structured Inputs

From Bayesian Sparsity to Gated Recurrent Nets

Inference algorithms and learning theory for Bayesian sparse factor analysis