Abstract:The sparse coding model posits that the visual system has evolved to efficiently code natural stimuli using a sparse set of features from an overcomplete dictionary. The original sparse coding model suffered from two key limitations, however: (1) computing the neural response to an image patch required minimizing a nonlinear objective function via recurrent dynamics; (2) fitting relied on approximate inference methods that ignored uncertainty. Although subsequent work has developed several methods to overcome these obstacles, we propose a novel solution inspired by the variational auto-encoder (VAE) framework. We introduce the sparse-coding variational auto-encoder (SVAE), which augments the sparse coding model with a probabilistic recognition model parametrized by a deep neural network. This recognition model provides a neurally plausible feedforward implementation for the mapping from image patches to neural activities, and enables a principled method for fitting the sparse coding model to data via maximization of the evidence lower bound (ELBO). The SVAE differs from standard VAEs in three key respects: the latent representation is overcomplete (there are more latent dimensions than image pixels), the prior is sparse or heavy-tailed instead of Gaussian, and the decoder network is a linear projection instead of a deep network. We fit the SVAE to natural image data under different assumed prior distributions, and show that it obtains higher test performance than previous fitting methods. Finally, we examine the response properties of the recognition network and show that it captures important nonlinear properties of neurons in the early visual pathway.

What problem does this paper attempt to address?

The main problems that this paper attempts to solve are two key limitations in existing sparse - coding models when calculating neural responses and fitting data: 1. **High computational complexity**: The original sparse - coding model needs to calculate the neural responses of image patches by minimizing a nonlinear objective function, which usually requires an iterative dynamic process and has a high computational cost. 2. **Ignoring uncertainty**: Existing fitting methods rely on approximate inference methods, which ignore uncertainty information, resulting in limited interpretability of the model. To solve these problems, the author proposes a new model based on the variational auto - encoder (VAE) framework - the Sparse - Coding Variational Auto - Encoder (SVAE). SVAE provides a neurally - plausible feed - forward implementation for mapping from image patches to neural activities by introducing a probabilistic recognition model parameterized by a deep neural network. In addition, SVAE also provides a principled method for fitting sparse - coding models by maximizing the Evidence Lower Bound (ELBO). ### Main contributions 1. **Improved sparse - coding model**: - **Over - complete latent representation**: The dimension of the latent variables in SVAE is greater than the dimension of the image pixels, that is, the dimension of the latent variables is greater than the dimension of the data. - **Heavy - tailed prior distribution**: SVAE uses a heavy - tailed distribution (such as Cauchy or Laplace distribution) as the prior of the latent variables instead of the standard Gaussian distribution. - **Linear decoder**: The decoder network of SVAE is a linear projection rather than a deep neural network. 2. **Efficient fitting method**: - By maximizing ELBO, SVAE provides an efficient and principled method for fitting sparse - coding models, overcoming the limitations of traditional methods. 3. **Performance improvement**: - The author fitted SVAE on natural image data and showed that its performance on the test set is better than previous fitting methods. 4. **Biological plausibility**: - The author analyzed the response characteristics of the recognition network and found that it can capture important nonlinear characteristics of neurons in the early visual pathway, such as orientation tuning, surround suppression and frequency tuning. ### Conclusion By introducing SVAE, the author not only solves the key problems of sparse - coding models in calculation and fitting, but also provides a more efficient and reasonable method for understanding and modeling the neural responses of the visual system. This method is of great significance both theoretically and in application, especially in the fields of computational neuroscience and computer vision.

Sparse-Coding Variational Auto-Encoders

Sparse-Coding Variational Autoencoders

Sparse, Geometric Autoencoder Models of V1

Improved Training of Sparse Coding Variational Autoencoder via Weight Normalization

SC-VAE: Sparse Coding-based Variational Autoencoder with Learned ISTA

Variational Autoencoder: An Unsupervised Model for Modeling and Decoding fMRI Activity in Visual Cortex

Sparse Coding with Multi-Layer Decoders using Variance Regularization

ESVAE: An Efficient Spiking Variational Autoencoder with Reparameterizable Poisson Spiking Sampling

Poisson Variational Autoencoder

Sequential sparse autoencoder for dynamic heading representation in ventral intraparietal area

Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders

Autoencoding Variational Autoencoder

Connections with Robust PCA and the Role of Emergent Sparsity in Variational Autoencoder Models

Supervising the Decoder of Variational Autoencoders to Improve Scientific Utility

Scalable Modeling of Spatiotemporal Data using the Variational Autoencoder: an Application in Glaucoma

eVAE: Evolutionary Variational Autoencoder

Model Order Selection with Variational Autoencoding

Sparsey™: event recognition via deep hierarchical sparse distributed codes

Iterative VAE as a predictive brain model for out-of-distribution generalization

Hidden Talents of the Variational Autoencoder

Uncertainty in latent representations of variational autoencoders optimized for visual tasks