Abstract: Brain decoding is a field of computational neuroscience that aims to infer mental states or internal representations of perceptual inputs from measurable brain activity. In this study, we propose a novel approach to brain decoding that relies on semantic and contextual similarity. We use several fMRI datasets of natural images as stimuli and create a deep learning decoding pipeline inspired by the bottom-up and top-down processes in human vision. Our pipeline includes a linear brain-to-feature model that maps fMRI activity to semantic features of the visual stimuli. We assume that the brain projects visual information onto a space that is homeomorphic to the latent space represented by the last layer of a pretrained neural network, which summarizes and highlights similarities and differences between concepts. These features are categorized in the latent space using a nearest-neighbor strategy, and the results are used to retrieve images or to condition a generative latent diffusion model to create novel images. We demonstrate semantic classification and image retrieval on three different fMRI datasets: GOD (visual perception and imagination), BOLD5000, and NSD. In all cases, a simple mapping between fMRI and a deep semantic representation of the visual stimulus resulted in meaningful classification and in meaningful retrieved or generated images. We assessed quality using quantitative metrics and a human evaluation experiment that reproduces the multiplicity of conscious and unconscious criteria that humans use to evaluate image similarity. In over 80% of the test set, the output of our method was evaluated as correct. In summary, our results demonstrate that measurable neural correlates can be linearly mapped onto the latent space of a neural network to synthesize images that match the original content. The findings have implications for both cognitive neuroscience and artificial intelligence.
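The two core steps described in the abstract, a linear brain-to-feature mapping followed by nearest-neighbor categorization in the latent space, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic, the linear model is fit with closed-form ridge regression (the abstract does not state which regularization is used), the nearest-neighbor step uses cosine similarity against one prototype per class, and the names `fit_ridge`, `nearest_prototype`, and `simulate` are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_voxels = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ Y)

def nearest_prototype(features, prototypes):
    """Index of the most cosine-similar prototype for each row of features."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return np.argmax(f @ p.T, axis=1)

# Synthetic stand-ins (assumptions): class prototypes play the role of
# latent features of a pretrained network, and true_map is a made-up
# linear feature-to-voxel response used only to simulate fMRI data.
rng = np.random.default_rng(0)
n_classes, n_feat, n_vox = 5, 16, 40
prototypes = rng.normal(size=(n_classes, n_feat))
true_map = rng.normal(size=(n_feat, n_vox))

def simulate(n_trials):
    """Draw labels, noisy stimulus features, and noisy simulated voxel responses."""
    labels = rng.integers(0, n_classes, size=n_trials)
    feats = prototypes[labels] + 0.1 * rng.normal(size=(n_trials, n_feat))
    fmri = feats @ true_map + 0.1 * rng.normal(size=(n_trials, n_vox))
    return fmri, feats, labels

X_train, Y_train, _ = simulate(200)
X_test, _, y_test = simulate(50)

# Step 1: linear brain-to-feature model (voxels -> latent features).
W = fit_ridge(X_train, Y_train, alpha=1.0)

# Step 2: categorize predicted features by their nearest class prototype.
pred = nearest_prototype(X_test @ W, prototypes)
accuracy = float(np.mean(pred == y_test))
print(f"toy accuracy: {accuracy:.2f}")
```

In a pipeline like the one the abstract describes, the predicted category or feature vector would then be used to retrieve images or to condition a generative model; that stage is omitted here because it depends on a specific pretrained model.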

Mind captioning: Evolving descriptive text of mental content from human brain activity

Describing Semantic Representations of Brain Activity Evoked by Visual Stimuli

Brain Captioning: Decoding human brain activity into images and text

MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

BrainChat: Decoding Semantic Information from fMRI using Vision-language Pretrained Models

MindGPT: Interpreting What You See with Non-invasive Brain Recordings

DreamCatcher: Revealing the Language of the Brain with fMRI using GPT Embedding

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity

Emotion Recognition with Feature Extracted from the Manifold of Brain Networks

Mind Artist: Creating Artistic Snapshots with Human Thought

A neural decoding algorithm that generates language from visual activity evoked by natural images

UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity

Brain2Word: Decoding Brain Activity for Language Generation

Decoding Linguistic Representations of Human Brain

Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

A dual‐channel language decoding from brain activity with progressive transfer training

Reading visually embodied meaning from the brain: Visually grounded computational models decode visual-object mental imagery induced by written text

Retrieving and reconstructing conceptually similar images from fMRI with latent diffusion models and a neuro-inspired brain decoding model

RealMind: Zero-Shot EEG-Based Visual Decoding and Captioning Using Multi-Modal Models

Looking through the mind's eye via multimodal encoder-decoder networks