Abstract: A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient, which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.

Neural coding seeks to understand how the brain represents the world by modeling the relationship between stimuli and the internal neural representations thereof. The field focuses on predicting brain responses to stimuli (neural encoding) and deciphering information about stimuli from brain activity (neural decoding). Recent advances in generative adversarial networks (GANs; a type of machine learning model) have enabled the creation of photorealistic images. Like the brain, GANs have internal representations of the images they create, referred to as "latents". More recently, a new type of feature-disentangled "w-latent" of GANs has been developed that more effectively separates different image features (e.g., color, shape, texture). In our study, we presented such GAN-generated pictures to a macaque with cortical implants and found that the underlying w-latents were accurate predictors of high-level brain activity. We then used these w-latents to reconstruct the perceived images with high fidelity. The remarkable similarities between our predictions and the actual targets indicate an alignment in how w-latents and neural activity represent the same stimulus, even though GANs have never been optimized on neural data. This implies a general principle of shared encoding of visual phenomena and emphasizes the importance of feature disentanglement in deeper visual areas.
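The two directions of analysis described above, encoding (latents to neural responses) and decoding (neural responses to latents), can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes plain ridge regression, uses synthetic data in place of StyleGAN w-latents and recorded MUA, and every name and size in it (`fit_ridge`, `columnwise_pearson`, `n_sites`, the regularization strength) is hypothetical. The step that would pass decoded latents through a generator to obtain an image is omitted.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    Each column of Y gets its own weight vector, so fitting all columns
    at once is equivalent to fitting one independent model per column.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def columnwise_pearson(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / (
        np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    )


# Synthetic stand-ins (assumption): random "latents" and noisy linear "responses".
rng = np.random.default_rng(0)
n_train, n_test, n_latent, n_sites = 400, 100, 16, 32
latents = rng.standard_normal((n_train + n_test, n_latent))
true_map = rng.standard_normal((n_latent, n_sites))
noise = 0.5 * rng.standard_normal((n_train + n_test, n_sites))
responses = latents @ true_map + noise

train, test = slice(0, n_train), slice(n_train, None)

# Encoding: predict each recording site's response from the latents.
W_enc = fit_ridge(latents[train], responses[train])
predicted_responses = latents[test] @ W_enc
encoding_r = columnwise_pearson(predicted_responses, responses[test])

# Decoding: predict the full latent vector from all sites jointly.
W_dec = fit_ridge(responses[train], latents[train])
predicted_latents = responses[test] @ W_dec
decoding_r = columnwise_pearson(predicted_latents, latents[test])

print("mean encoding correlation per site:", encoding_r.mean())
print("mean decoding correlation per latent dimension:", decoding_r.mean())
```

In this sketch, comparing representations (for example z versus w versus CLIP) would amount to swapping the `latents` matrix and comparing the held-out `encoding_r` values per site.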

Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain
