Abstract: Brain decoding is a field of computational neuroscience that aims to infer mental states or internal representations of perceptual inputs from measurable brain activity. In this study, we propose a novel approach to brain decoding that relies on semantic and contextual similarity. We use several fMRI datasets with natural images as stimuli and build a deep learning decoding pipeline inspired by the bottom-up and top-down processes in human vision. Our pipeline includes a linear brain-to-feature model that maps fMRI activity to semantic features of the visual stimuli. We assume that the brain projects visual information onto a space that is homeomorphic to the latent space represented by the last layer of a pretrained neural network, which summarizes and highlights similarities and differences between concepts. These features are categorized in the latent space using a nearest-neighbor strategy, and the results are used to retrieve images or to condition a generative latent diffusion model to create novel images. We demonstrate semantic classification and image retrieval on three different fMRI datasets: GOD (visual perception and imagination), BOLD5000, and NSD. In all cases, a simple mapping between fMRI and a deep semantic representation of the visual stimulus resulted in meaningful classification and in meaningful retrieved or generated images. We assessed quality using quantitative metrics and a human evaluation experiment that reproduces the multiplicity of conscious and unconscious criteria that humans use to evaluate image similarity. Our method achieved correct evaluation in over 80% of the test set. In summary, our study proposes a novel approach to brain decoding that relies on semantic and contextual similarity. Our results demonstrate that measurable neural correlates can be linearly mapped onto the latent space of a neural network to synthesize images that match the original content. The findings have implications for both cognitive neuroscience and artificial intelligence.
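The two core steps named in the abstract, a linear brain-to-feature mapping followed by a nearest-neighbor lookup in latent space, can be illustrated with a minimal sketch. This is not the authors' implementation: the use of ridge regression as the linear model, cosine similarity as the neighbor metric, the function names `fit_ridge` and `nearest_neighbors`, and all array sizes are assumptions made for illustration, and the data are synthetic stand-ins for fMRI responses and network features.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y."""
    n_inputs = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_inputs)
    return np.linalg.solve(gram, X.T @ Y)


def nearest_neighbors(pred, gallery, k=1):
    """Return indices of the k gallery rows most cosine-similar to each row of pred."""
    pred_n = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gal_n = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = pred_n @ gal_n.T  # shape: (n_queries, n_gallery)
    return np.argsort(-sims, axis=1)[:, :k]


# Synthetic stand-ins (assumption): rows are trials, columns are voxels or latent dims.
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_latent = 200, 20, 50, 16
true_map = rng.normal(size=(n_voxels, n_latent))

fmri_train = rng.normal(size=(n_train, n_voxels))
fmri_test = rng.normal(size=(n_test, n_voxels))
# "Image features" are a noisy linear function of the simulated brain activity.
feat_train = fmri_train @ true_map + 0.1 * rng.normal(size=(n_train, n_latent))
feat_test = fmri_test @ true_map  # gallery of candidate image features

# Step 1: learn the linear brain-to-feature mapping on training trials.
W = fit_ridge(fmri_train, feat_train, alpha=1.0)

# Step 2: predict latent features for held-out trials and retrieve the closest image.
pred_test = fmri_test @ W
top1 = nearest_neighbors(pred_test, feat_test, k=1)[:, 0]
accuracy = float(np.mean(top1 == np.arange(n_test)))
print(f"top-1 retrieval accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting the gallery would hold features extracted from candidate images by the pretrained network, and the retrieved neighbors (or the predicted features themselves) would be what conditions the generative model; that conditioning step is outside the scope of this sketch.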

Towards Neural Foundation Models for Vision: Aligning EEG, MEG, and fMRI Representations for Decoding, Encoding, and Modality Conversion

Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding

Brain decoding: toward real-time reconstruction of visual perception

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

Decoding visual brain representations from electroencephalography through Knowledge Distillation and latent diffusion models

Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion

Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features

NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis

Achieving More Human Brain-Like Vision via Human EEG Representational Alignment

Retrieving and reconstructing conceptually similar images from fMRI with latent diffusion models and a neuro-inspired brain decoding model

Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex

Through their eyes: multi-subject Brain Decoding with simple alignment techniques

Neural Decoding of Visual Information Across Different Neural Recording Modalities and Approaches

Visual Encoding and Decoding of the Human Brain Based on Shared Features

Joint fMRI Decoding and Encoding with Latent Embedding Alignment

See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI

NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties

Brain-aligning of semantic vectors improves neural decoding of visual stimuli

Quantitative fusion of NLP, fMRI, and EEG data : A mathematical model for decoding semantic processing in the brain

Brain encoding models based on multimodal transformers can transfer across language and vision