Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

Yanchen Wang, Adam Turnbull, Tiange Xiang, Yunlong Xu, Sa Zhou, Adnan Masoud, Shekoofeh Azizi, Feng Vankee Lin, Ehsan Adeli
2024-11-12
Abstract: Neural decoding, the process of understanding how brain activity corresponds to different stimuli, has been a primary objective in cognitive sciences. Over the past three decades, advancements in functional Magnetic Resonance Imaging (fMRI) and machine learning have greatly improved our ability to map visual stimuli to brain activity, especially in the visual cortex. Concurrently, research has expanded into decoding more complex processes like language and memory across the whole brain, utilizing techniques to handle greater variability and improve signal accuracy. We argue that "seeing" involves more than just mapping visual stimuli onto the visual cortex; it engages the entire brain, as various emotions and cognitive states can emerge from observing different scenes. In this paper, we develop algorithms to enhance our understanding of visual processes by incorporating whole-brain activation maps while individuals are exposed to visual stimuli. We utilize large-scale fMRI encoders and image generative models pre-trained on large public datasets, which are then fine-tuned through image-fMRI contrastive learning. Our models can therefore decode visual experience across the entire cerebral cortex, surpassing the traditional confines of the visual cortex. We first compare our method with state-of-the-art approaches to decoding visual processing and show a 43% improvement in predictive semantic accuracy. A network ablation analysis suggests that beyond the visual cortex, the default mode network contributes most to decoding stimuli, in line with the proposed role of this network in sense-making and semantic processing. Additionally, we implemented zero-shot imagination decoding on an extra validation dataset, achieving a p-value of 0.0206 for mapping the reconstructed images and ground-truth text stimuli, which substantiates the model's capability to capture semantic meanings across various scenarios.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the problem of decoding visual experience and mapping semantics through whole-brain analysis. Specifically, the researchers aim to better understand how brain activity measured with functional magnetic resonance imaging (fMRI) corresponds to different stimuli, and in particular how visual stimuli map onto brain activity. Earlier work has focused mainly on the visual cortex; this paper instead uses whole-brain activation maps, recorded while individuals view visual stimuli, so that decoding draws on activity patterns across the entire cerebral cortex rather than the visual cortex alone. The main contribution is a set of algorithms built on large-scale fMRI encoders and image generative models that are pre-trained on large public datasets and then fine-tuned through image-fMRI contrastive learning. With this approach, the authors report a 43% improvement in predictive semantic accuracy over state-of-the-art approaches. A network ablation analysis further suggests that, beyond the visual cortex, the default mode network contributes most to decoding stimuli, which is consistent with the proposed role of this network in sense-making and semantic processing. The study also performs zero-shot imagination decoding on an additional validation dataset, reaching a p-value of 0.0206 for the mapping between reconstructed images and ground-truth text stimuli, which supports the model's ability to capture semantic meaning across scenarios. Together, these findings point to the potential of whole-brain models for a finer-grained interpretation of semantic processes.
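To make the idea of image-fMRI contrastive learning more concrete, the following is a minimal sketch of a symmetric, CLIP-style contrastive (InfoNCE) loss over paired fMRI and image embeddings. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss form, the temperature, or the embedding dimensions, and the function names (`symmetric_contrastive_loss`, `l2_normalize`, `log_softmax`) are hypothetical. Pure Python is used so the example runs without any dependencies.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(fmri_emb, image_emb, temperature=0.07):
    """Assumed CLIP-style loss: row i of fmri_emb is paired with row i of image_emb.

    The temperature value is an illustrative default, not taken from the paper.
    """
    f = [l2_normalize(v) for v in fmri_emb]
    g = [l2_normalize(v) for v in image_emb]
    n = len(f)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(f[i], g[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # fMRI -> image direction: each fMRI sample should pick out its own image.
    loss_f2i = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Image -> fMRI direction: same computation on the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_i2f = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_f2i + loss_i2f)


# Toy embeddings: matched pairs point in the same direction.
fmri = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[2.0, 0.0], [0.0, 3.0]]
swapped_images = [[0.0, 3.0], [2.0, 0.0]]

aligned_loss = symmetric_contrastive_loss(fmri, aligned_images)
swapped_loss = symmetric_contrastive_loss(fmri, swapped_images)
```

In this toy setup the loss is close to zero when each fMRI embedding lines up with its own image embedding and is large when the pairs are swapped, which is the behavior fine-tuning would exploit to pull matching fMRI and image representations together.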