Sparse-Coding Variational Auto-Encoders

Victor Geadah, Gabriel Barello, Daniel Greenidge, Adam S. Charles, Jonathan W. Pillow
DOI: https://doi.org/10.1101/399246
2024-03-15
Abstract: The sparse coding model posits that the visual system has evolved to efficiently code natural stimuli using a sparse set of features from an overcomplete dictionary. The original sparse coding model suffered from two key limitations, however: (1) computing the neural response to an image patch required minimizing a nonlinear objective function via recurrent dynamics; (2) fitting relied on approximate inference methods that ignored uncertainty. Although subsequent work has developed several methods to overcome these obstacles, we propose a novel solution inspired by the variational auto-encoder (VAE) framework. We introduce the sparse-coding variational auto-encoder (SVAE), which augments the sparse coding model with a probabilistic recognition model parametrized by a deep neural network. This recognition model provides a neurally plausible feedforward implementation for the mapping from image patches to neural activities, and enables a principled method for fitting the sparse coding model to data via maximization of the evidence lower bound (ELBO). The SVAE differs from standard VAEs in three key respects: the latent representation is overcomplete (there are more latent dimensions than image pixels), the prior is sparse or heavy-tailed instead of Gaussian, and the decoder network is a linear projection instead of a deep network. We fit the SVAE to natural image data under different assumed prior distributions, and show that it obtains higher test performance than previous fitting methods. Finally, we examine the response properties of the recognition network and show that it captures important nonlinear properties of neurons in the early visual pathway.
Neuroscience
What problem does this paper attempt to address?
The paper addresses two key limitations of the original sparse-coding model, one in computing neural responses and one in fitting the model to data:

1. **High computational cost**: Computing the neural response to an image patch requires minimizing a nonlinear objective function, which is done through iterative recurrent dynamics rather than a single feedforward pass.
2. **Ignoring uncertainty**: Fitting relies on approximate inference methods that ignore uncertainty.

To address these problems, the authors propose a new model based on the variational auto-encoder (VAE) framework: the sparse-coding variational auto-encoder (SVAE). The SVAE augments the sparse-coding model with a probabilistic recognition model parameterized by a deep neural network, which gives a neurally plausible feedforward mapping from image patches to neural activities. It also gives a principled method for fitting the sparse-coding model by maximizing the evidence lower bound (ELBO).

### Main contributions

1. **Sparse-coding model in a VAE framework**: The SVAE differs from a standard VAE in three respects.
   - **Overcomplete latent representation**: There are more latent dimensions than image pixels.
   - **Sparse or heavy-tailed prior**: The prior over the latent variables is sparse or heavy-tailed (such as a Laplace or Cauchy distribution) instead of Gaussian.
   - **Linear decoder**: The decoder is a linear projection rather than a deep neural network.
2. **Principled fitting method**:
   - Maximizing the ELBO provides a principled way to fit the sparse-coding model, in contrast to the approximate inference of the original model, which ignored uncertainty.
3. **Performance improvement**:
   - The authors fit the SVAE to natural image data under different assumed priors and show that it obtains higher test performance than previous fitting methods.
4. **Biological plausibility**:
   - The authors analyze the response properties of the recognition network and find that it captures important nonlinear properties of neurons in the early visual pathway, such as orientation tuning, surround suppression, and frequency tuning.

### Conclusion

The SVAE addresses the two key limitations of the original sparse-coding model: it replaces recurrent inference with a feedforward recognition network, and it fits the model by maximizing the ELBO. The approach is relevant to computational neuroscience, as a model of neural responses in the visual system, and to computer vision.
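
The fitting objective summarized above can be illustrated with a minimal sketch of a single-sample-batch Monte Carlo ELBO estimate, $\mathbb{E}_{q(z \mid x)}[\log p(x \mid z) + \log p(z) - \log q(z \mid x)]$. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the Gaussian recognition model with diagonal covariance, the one-hidden-layer ReLU network, the Gaussian pixel noise, the Laplace prior, the layer sizes, and all function and parameter names (`init_params`, `recognition`, `elbo_estimate`, `noise_std`, `prior_scale`). Only the three structural choices come from the text: an overcomplete latent space, a sparse prior, and a linear decoder.

```python
import numpy as np


def init_params(n_pixels, n_latents, n_hidden, rng):
    """Random parameters (hypothetical sizes and initialization scale)."""
    s = 0.1
    return {
        # Linear decoder: columns of A play the role of dictionary elements.
        "A": s * rng.standard_normal((n_pixels, n_latents)),
        # Recognition network: one hidden ReLU layer, then mean and log-std heads.
        "W1": s * rng.standard_normal((n_hidden, n_pixels)),
        "b1": np.zeros(n_hidden),
        "W_mu": s * rng.standard_normal((n_latents, n_hidden)),
        "b_mu": np.zeros(n_latents),
        "W_ls": s * rng.standard_normal((n_latents, n_hidden)),
        "b_ls": np.zeros(n_latents),
    }


def recognition(params, x):
    """Feedforward map from an image patch x to the mean and log-std of q(z | x)."""
    h = np.maximum(0.0, params["W1"] @ x + params["b1"])
    mu = params["W_mu"] @ h + params["b_mu"]
    log_std = params["W_ls"] @ h + params["b_ls"]
    return mu, log_std


def elbo_estimate(params, x, rng, n_samples=32, noise_std=0.5, prior_scale=1.0):
    """Monte Carlo estimate of the ELBO for one patch x (assumed Laplace prior)."""
    mu, log_std = recognition(params, x)
    eps = rng.standard_normal((n_samples, mu.size))
    z = mu + np.exp(log_std) * eps  # reparameterized samples from q(z | x)

    # log p(x | z): Gaussian noise around the linear reconstruction A z.
    resid = x - z @ params["A"].T
    log_lik = (-0.5 * np.sum(resid**2, axis=1) / noise_std**2
               - x.size * np.log(noise_std * np.sqrt(2.0 * np.pi)))

    # log p(z): independent Laplace prior on each latent dimension.
    log_prior = np.sum(-np.abs(z) / prior_scale - np.log(2.0 * prior_scale), axis=1)

    # log q(z | x): diagonal Gaussian evaluated at its own samples.
    log_q = np.sum(-0.5 * eps**2 - log_std - 0.5 * np.log(2.0 * np.pi), axis=1)

    return float(np.mean(log_lik + log_prior - log_q))


rng = np.random.default_rng(0)
n_pixels, n_latents = 16, 32  # overcomplete: more latents than pixels
params = init_params(n_pixels, n_latents, n_hidden=24, rng=rng)
x = rng.standard_normal(n_pixels)  # stand-in for a whitened image patch
mu, log_std = recognition(params, x)
value = elbo_estimate(params, x, rng)
print("ELBO estimate:", value)
```

In practice the parameters would be trained by gradient ascent on this quantity averaged over many patches, typically with an automatic-differentiation library; the sketch only evaluates the objective. Swapping the `log_prior` line for another log-density would correspond to fitting under a different assumed prior.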