Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Yanchen Wang,Adam Turnbull,Tiange Xiang,Yunlong Xu,Sa Zhou,Adnan Masoud,Shekoofeh Azizi,Feng Vankee Lin,Ehsan Adeli

2024-11-12

Abstract:Neural decoding, the process of understanding how brain activity corresponds to different stimuli, has been a primary objective in cognitive sciences. Over the past three decades, advancements in functional Magnetic Resonance Imaging and machine learning have greatly improved our ability to map visual stimuli to brain activity, especially in the visual cortex. Concurrently, research has expanded into decoding more complex processes like language and memory across the whole brain, utilizing techniques to handle greater variability and improve signal accuracy. We argue that "seeing" involves more than just mapping visual stimuli onto the visual cortex; it engages the entire brain, as various emotions and cognitive states can emerge from observing different scenes. In this paper, we develop algorithms to enhance our understanding of visual processes by incorporating whole-brain activation maps while individuals are exposed to visual stimuli. We utilize large-scale fMRI encoders and Image generative models pre-trained on large public datasets, which are then fine-tuned through Image-fMRI contrastive learning. Our models hence can decode visual experience across the entire cerebral cortex, surpassing the traditional confines of the visual cortex. We first compare our method with state-of-the-art approaches to decoding visual processing and show improved predictive semantic accuracy by 43%. A network ablation analysis suggests that beyond the visual cortex, the default mode network contributes most to decoding stimuli, in line with the proposed role of this network in sense-making and semantic processing. Additionally, we implemented zero-shot imagination decoding on an extra validation dataset, achieving a p-value of 0.0206 for mapping the reconstructed images and ground-truth text stimuli, which substantiates the model's capability to capture semantic meanings across various scenarios.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The problem that this paper attempts to solve is to decode visual experience and semantic mapping through whole - brain analysis. Specifically, researchers hope to improve the understanding of the relationship between brain activity and different stimuli by using the functional magnetic resonance imaging (fMRI) - based model, especially how to map visual stimuli onto brain activity. Traditional research has mainly focused on the visual cortex, while this paper proposes a new method that uses the whole - brain activation map to enhance the understanding of visual processes. This method is not limited to the visual cortex but takes into account the activity patterns of the entire brain. The main contribution of the paper lies in the development of an algorithm that can decode visual experiences across the entire cerebral cortex by integrating the whole - brain activation maps of individuals when exposed to visual stimuli, going beyond the limitations of the traditional visual cortex. Researchers utilize large - scale fMRI encoders and image - generation models (encoders and decoders) that are pre - trained on large public datasets and fine - tuned through image - fMRI contrastive learning. Through this method, they not only improve their ability to predict semantic accuracy but also reveal that, apart from the visual cortex, the default mode network contributes the most to decoding stimuli, which is consistent with the role of this network in meaning construction and semantic processing. In addition, this study also achieves zero - shot imaginative decoding and reaches a significant p - value (0.0206) for mapping reconstructed images and real - text stimuli on an additional validation dataset, demonstrating the model's ability to capture semantic meanings in various scenarios. These findings emphasize the potential of using comprehensive models in enhancing the fine - grained interpretation of semantic processes.

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

MindGPT: Interpreting What You See with Non-invasive Brain Recordings

‘When’ and ‘what’ did you see? A novel fMRI-based visual decoding framework

Brain decoding: toward real-time reconstruction of visual perception

From Sight to Insight: A Multi-task Approach with the Visual Language Decoding Model

Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction

MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

Decoding dynamic visual scenes across the brain hierarchy

Brain-aligning of semantic vectors improves neural decoding of visual stimuli

Reading visually embodied meaning from the brain: Visually grounded computational models decode visual-object mental imagery induced by written text

fMRI-based Decoding of Visual Information from Human Brain Activity: A Brief Review

A Visual Encoding Model Based on Deep Neural Networks and Transfer Learning for Brain Activity Measured by Functional Magnetic Resonance Imaging

Multi-Semantic Decoding of Visual Perception with Graph Neural Networks

Efficient Neural Decoding Based on Multimodal Training

The brain-inspired decoder for natural visual image reconstruction

Decoding region-level visual functions from invasive EEG data

Brain Dialogue Interface (BDI): A User-Friendly fMRI Model for Interactive Brain Decoding

An fMRI-based visual decoding framework combined with two-stage learning and asynchronous iterative update strategy

Modality-Agnostic fMRI Decoding of Vision and Language

Neural Decoding of Visual Information Across Different Neural Recording Modalities and Approaches

Retrieving and reconstructing conceptually similar images from fMRI with latent diffusion models and a neuro-inspired brain decoding model