Reconstruction of Natural Images from Human fMRI Using a Three-Stage Multi-Level Deep Fusion Model
Lu Meng, Zhenxuan Tang, Yangqian Liu
DOI: https://doi.org/10.1016/j.jneumeth.2024.110269
2024-08-31
Abstract: Background: Image reconstruction is a critical task in brain decoding research, primarily utilizing functional magnetic resonance imaging (fMRI) data. However, owing to challenges such as the limited number of samples in fMRI datasets, reconstruction quality often remains poor. New method: We propose a three-stage multi-level deep fusion model (TS-ML-DFM). The model is trained in three stages and comprises image encoders, generators, discriminators, and fMRI encoders. It incorporates distinct supplementary features derived separately from depth images and from the original images, and it integrates a random shift module, a dual attention module, and a multi-level feature fusion module. Results: In both qualitative and quantitative comparisons on the Horikawa17 and VanGerven10 datasets, our method exhibited excellent performance. Comparison with existing methods: On the primary Horikawa17 dataset, our method was compared with other leading methods using the following metrics: average hash value, histogram similarity, mutual information, structural similarity accuracy, AlexNet(2), AlexNet(5), and pairwise human perceptual similarity accuracy. Relative to the second-ranked result on each metric, the proposed method achieved improvements of 0.99%, 3.62%, 3.73%, 2.45%, 3.51%, 0.62%, and 1.03%, respectively. On the SwAV top-level semantic metric, it achieved a substantial improvement of 10.53% over the second-ranked result among pixel-level reconstruction methods. Conclusions: When applied to decoding brain visual patterns from fMRI data, the proposed TS-ML-DFM method outperformed previous algorithms, thereby facilitating further advances in this field.
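To make two of the evaluation metrics named in the abstract more concrete, the following is a minimal Python sketch of an average-hash similarity and of a pairwise comparison accuracy. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not specify the hash size, the preprocessing, or the exact pairwise protocol, so the function names (`average_hash`, `hash_similarity`, `pairwise_accuracy`), the 8x8 hash grid, and the block-mean downsampling are all hypothetical choices made here.

```python
import numpy as np


def average_hash(gray, hash_size=8):
    """Return a boolean average-hash vector for a 2-D grayscale image.

    Assumption: the image is downsampled by block averaging to a
    hash_size x hash_size grid (the paper's preprocessing is not stated).
    """
    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape
    block_h, block_w = height // hash_size, width // hash_size
    if block_h == 0 or block_w == 0:
        raise ValueError("image is smaller than the hash grid")
    # Crop so that the image divides evenly into blocks, then average each block.
    cropped = gray[: block_h * hash_size, : block_w * hash_size]
    blocks = cropped.reshape(hash_size, block_h, hash_size, block_w).mean(axis=(1, 3))
    # Each bit records whether a block is brighter than the mean of all blocks.
    return (blocks > blocks.mean()).ravel()


def hash_similarity(image_a, image_b, hash_size=8):
    """Fraction of matching hash bits between two images (1.0 means identical hashes)."""
    bits_a = average_hash(image_a, hash_size)
    bits_b = average_hash(image_b, hash_size)
    return float(np.mean(bits_a == bits_b))


def pairwise_accuracy(similarity):
    """Two-way comparison accuracy from a square similarity matrix.

    similarity[i, j] is the similarity between reconstruction i and
    ground-truth image j. A comparison counts as correct when the
    reconstruction is more similar to its own ground truth than to a
    distractor (assumed protocol, not confirmed by the abstract).
    """
    similarity = np.asarray(similarity, dtype=np.float64)
    n = similarity.shape[0]
    own = np.diag(similarity)[:, None]
    wins = own > similarity
    off_diagonal = ~np.eye(n, dtype=bool)
    return float(wins[off_diagonal].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truths = [rng.random((64, 64)) for _ in range(4)]
    # Simulated reconstructions: ground truth plus mild noise.
    recons = [t + 0.05 * rng.standard_normal(t.shape) for t in truths]
    sim = np.array([[hash_similarity(r, t) for t in truths] for r in recons])
    print("pairwise accuracy:", pairwise_accuracy(sim))
```

The same `pairwise_accuracy` function can be applied to any similarity matrix, for example one built from feature correlations of a pretrained network, which is one plausible reading of layer-wise metrics such as AlexNet(2) and AlexNet(5).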