See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI

Yulong Liu, Yongqiang Ma, Guibo Zhu, Haodong Jing, Nanning Zheng
2024-06-13
Abstract: Deciphering visual content from functional Magnetic Resonance Imaging (fMRI) helps illuminate the human vision system. However, the scarcity of fMRI data and noise hamper brain decoding model performance. Previous approaches primarily employ subject-specific models, sensitive to training sample size. In this paper, we explore a straightforward but overlooked solution to address data scarcity. We propose shallow subject-specific adapters to map cross-subject fMRI data into unified representations. Subsequently, a shared deeper decoding model decodes cross-subject features into the target feature space. During training, we leverage both visual and textual supervision for multi-modal brain decoding. Our model integrates a high-level perception decoding pipeline and a pixel-wise reconstruction pipeline guided by high-level perceptions, simulating bottom-up and top-down processes in neuroscience. Empirical experiments demonstrate robust neural representation learning across subjects for both pipelines. Moreover, merging high-level and low-level information improves both low-level and high-level reconstruction metrics. Additionally, we successfully transfer learned general knowledge to new subjects by training new adapters with limited training data. Compared to previous state-of-the-art methods, notably pre-training-based methods (Mind-Vis and fMRI-PTE), our approach achieves comparable or superior results across diverse tasks, showing promise as an alternative method for cross-subject fMRI data pre-training. Our code and pre-trained weights will be publicly released at https://github.com/YulongBonjour/See_Through_Their_Minds.
Computer Vision and Pattern Recognition, Human-Computer Interaction
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the data scarcity and noise encountered when decoding visual content from functional magnetic resonance imaging (fMRI) data. Specifically, the authors point out:

1. **Data Scarcity**: The collection of fMRI data is costly and susceptible to physiological noise, resulting in a limited number of training samples per participant and a low signal-to-noise ratio.
2. **Individual Differences**: Since each individual's brain has unique functional and structural characteristics, most existing methods require building a model from scratch for each new individual, leading to significant variability in results.
3. **Overfitting**: Training models from scratch on small datasets easily leads to overfitting.

To address these issues, the authors propose a new method that improves the generalization ability of the model through cross-subject fMRI data pre-training and transfer learning. The specific methods include:

- **Shallow Subject Adapter**: Maps cross-subject fMRI data into a unified feature space.
- **Shared Deep Decoding Model**: Decodes cross-subject features into the target feature space.
- **Multimodal Supervision**: Uses visual and textual supervision for multimodal brain decoding.
- **High-Low Level Perception Fusion**: Combines a high-level perception decoding pipeline with a pixel-level reconstruction pipeline to simulate bottom-up and top-down processes in neuroscience.

Through these methods, the authors hope to improve the robustness and generalization ability of the model under data scarcity and to achieve performance comparable to or better than existing state-of-the-art methods across various tasks.
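
The adapter-plus-shared-decoder idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the subject names, voxel counts, layer sizes, and the helpers `linear`, `apply`, `decode`, and `add_subject` are all hypothetical, and random weights stand in for learned ones. It only shows how inputs of different sizes can pass through per-subject adapters into one shared decoder, and how a new subject might be added by creating only a new adapter.

```python
import random

rng = random.Random(0)

def linear(in_dim, out_dim):
    """Random (out_dim x in_dim) matrix standing in for a learned linear layer."""
    scale = 1.0 / in_dim ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]

def apply(weights, x):
    """Matrix-vector product: project x through one linear layer."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

# Assumed sizes, for illustration only: each subject has its own voxel count.
VOXELS = {"subj01": 120, "subj02": 95}
SHARED_DIM = 16   # size of the unified representation (assumption)
HIDDEN_DIM = 32   # hidden width of the shared decoder (assumption)
TARGET_DIM = 8    # size of the target feature space (assumption)

# One shallow adapter per subject; a single deeper decoder shared by all.
adapters = {name: linear(n, SHARED_DIM) for name, n in VOXELS.items()}
shared_decoder = [linear(SHARED_DIM, HIDDEN_DIM), linear(HIDDEN_DIM, TARGET_DIM)]

def decode(subject, fmri):
    """Subject-specific adapter first, then the shared decoder."""
    unified = apply(adapters[subject], fmri)
    hidden = relu(apply(shared_decoder[0], unified))
    return apply(shared_decoder[1], hidden)

def add_subject(subject, n_voxels):
    """Register a new subject by creating only a new adapter.

    The shared decoder is left untouched, mirroring the idea of training
    new adapters while reusing what was learned across subjects.
    """
    adapters[subject] = linear(n_voxels, SHARED_DIM)

# Inputs of different lengths end up in the same target space.
features = {
    name: decode(name, [rng.gauss(0.0, 1.0) for _ in range(n)])
    for name, n in VOXELS.items()
}
```

In this sketch only `adapters` grows when `add_subject` is called. In a trained system the new adapter's weights would be fitted on the new subject's limited data while the shared decoder's parameters are reused.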