MindShot: Brain Decoding Framework Using Only One Image

Shuai Jiang, Zhu Meng, Delong Liu, Haiwen Li, Fei Su, Zhicheng Zhao
2024-05-24
Abstract: Brain decoding, which aims to reconstruct visual stimuli from brain signals, primarily functional magnetic resonance imaging (fMRI), has recently made positive progress. However, it is impeded by significant challenges, such as the difficulty of acquiring fMRI-image pairs and the variability across individuals. Most methods have to adopt the per-subject-per-model paradigm, which greatly limits their applications. To alleviate this problem, we introduce a new and meaningful task, few-shot brain decoding, which faces two inherent difficulties: 1) the scarcity of fMRI-image pairs and the noisy signals can easily lead to overfitting; 2) inadequate guidance complicates the training of a robust encoder. We therefore propose a novel framework, MindShot, to achieve effective few-shot brain decoding by leveraging cross-subject prior knowledge. First, inspired by the hemodynamic response function (HRF), an HRF adapter is applied to eliminate unexplainable cognitive differences between subjects with few trainable parameters. Second, a Fourier-based cross-subject supervision method is presented to extract additional high-level and low-level biological guidance information from the signals of other subjects. Under MindShot, new subjects and pretrained individuals only need to view images of the same semantic class, significantly expanding the model's applicability. Experimental results demonstrate that MindShot reconstructs semantically faithful images in few-shot scenarios and outperforms methods based on the per-subject-per-model paradigm. The promising results not only validate the feasibility of few-shot brain decoding but also open the possibility of training large models with reduced data dependence.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper attempts to address the problem of achieving few-shot visual reconstruction in brain signal decoding. Specifically, it investigates several major challenges in existing brain decoding methods:

1. **Difficulty in Data Acquisition**: High-quality pairs of functional magnetic resonance imaging (fMRI) recordings and images are difficult to obtain, which limits model training.
2. **Individual Differences**: Brain signals differ significantly between individuals, so existing methods must train a separate model for each person. This approach is costly and time-consuming, and it severely limits the practical application of these methods.
3. **Risk of Overfitting**: In few-shot scenarios, models are prone to overfitting, especially in complex tasks like brain signal decoding.

To address these challenges, the paper proposes a new framework named MindShot, which aims to achieve effective few-shot brain signal decoding by leveraging cross-individual prior knowledge. The main innovations of the MindShot framework include:

- **HRF Adapter**: Inspired by the Hemodynamic Response Function (HRF), a lightweight HRF adapter is designed to reduce cognitive differences between individuals while lowering training costs and the risk of overfitting.
- **Fourier Transform-based Cross-Individual Supervision**: The Fourier transform is used to extract high-level and low-level features from the biological signals of other individuals, providing effective supervisory information for the visual decoding of new individuals.

Experimental results show that MindShot can generate semantically faithful images in few-shot scenarios and significantly outperforms existing methods based on the "per individual per model" paradigm across multiple evaluation metrics. These results not only validate the feasibility of few-shot brain signal decoding but also open the possibility of training large models with reduced data dependency.
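
The summary above does not say how the Fourier transform is applied in MindShot, so the following is only a generic illustration, not the authors' method. It shows one common way to split a 1-D signal into two Fourier-domain components (amplitude and phase) and to recombine components taken from two different signals. The function names, the random stand-in vectors, and the choice of an amplitude/phase split are all assumptions made for this sketch.

```python
import numpy as np


def split_amplitude_phase(signal):
    """Return (amplitude, phase) of the real-input FFT of a 1-D signal."""
    spectrum = np.fft.rfft(signal)
    return np.abs(spectrum), np.angle(spectrum)


def recombine(amplitude, phase, n):
    """Rebuild a length-n real signal from amplitude and phase spectra."""
    return np.fft.irfft(amplitude * np.exp(1j * phase), n=n)


# Random vectors standing in for fMRI responses (hypothetical data).
rng = np.random.default_rng(0)
new_subject = rng.standard_normal(256)  # stand-in for a new subject
ref_subject = rng.standard_normal(256)  # stand-in for a pretrained subject

amp_new, phase_new = split_amplitude_phase(new_subject)
amp_ref, phase_ref = split_amplitude_phase(ref_subject)

# Sanity check: a signal's own amplitude and phase reproduce it.
roundtrip = recombine(amp_new, phase_new, n=new_subject.size)

# Hybrid signal: amplitude from the reference subject, phase from the new one.
mixed = recombine(amp_ref, phase_new, n=new_subject.size)
```

In a cross-subject setting, components like these could in principle be compared or used as separate supervision targets, but which component corresponds to "high-level" or "low-level" information in MindShot is not stated in this summary.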