NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu

2024-07-18

Abstract: Recent fMRI-to-image approaches have mainly focused on associating fMRI signals with specific conditions of pre-trained diffusion models. These approaches, while producing high-quality images, capture only a limited aspect of the complex information in fMRI signals and offer little detailed control over image creation. In contrast, this paper proposes to directly modulate the generation process of diffusion models using fMRI signals. Our approach, NeuroPictor, divides the fMRI-to-image process into three steps: i) fMRI calibrated-encoding, which uses multi-individual pre-training to build a shared latent space that minimizes individual differences and enables the subsequent multi-subject training; ii) fMRI-to-image multi-subject pre-training, which perceptually learns to guide the diffusion model with high- and low-level conditions across different individuals; iii) fMRI-to-image single-subject refining, which is similar to step ii but focuses on adapting to a particular individual. NeuroPictor extracts high-level semantic features from fMRI signals that characterize the visual stimulus and incrementally fine-tunes the diffusion model with a low-level manipulation network to provide precise structural instructions. By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity, particularly in the within-subject setting, as evidenced on benchmark datasets. Our code and model are available at https://jingyanghuo.github.io/neuropictor/.

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

This paper addresses the problem of accurately reconstructing images from functional magnetic resonance imaging (fMRI) signals. Existing methods primarily associate fMRI signals with specific conditions of pre-trained diffusion models. Although these methods can generate high-quality images, they capture only limited information from the fMRI signals and offer little detailed control over image creation. To address these issues, the paper proposes the NeuroPictor framework, which uses fMRI signals to modulate the diffusion model's generation process directly, combining multi-subject pre-training with multi-level modulation. NeuroPictor is divided into three steps:

1. **fMRI Calibrated-Encoding**: Establishing a shared fMRI latent space through multi-individual pre-training, which minimizes individual differences and enables the subsequent multi-subject training.
2. **Multi-Subject Pre-Training**: Training across different individuals so that the diffusion model learns to be guided by both high-level and low-level conditions.
3. **Single-Subject Refinement**: Repeating the procedure of step 2 with a focus on adapting the model to one particular individual.

The core of NeuroPictor is that it extracts high-level semantic features from fMRI signals and, in addition, provides precise structural instructions through a low-level manipulation network with which the diffusion model is incrementally fine-tuned. The model is trained with about 67,000 fMRI-image pairs from various individuals. According to the abstract, it achieves superior fMRI-to-image decoding on benchmark datasets, particularly in the within-subject setting.
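The three-step schedule described above can be pictured as a small data structure. The following is only an illustrative sketch in plain Python, not the authors' implementation: the `Stage` class, the `STAGES` tuple, the `select_samples` helper, and the toy `(subject_id, sample_id)` tuples are all hypothetical names introduced here. The only thing taken from the abstract is that the first two steps train across individuals while the last step adapts to a single individual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the schedule (hypothetical structure, for illustration only)."""
    name: str
    subjects: str  # "all" = data from every individual, "single" = one individual
    goal: str


# Hypothetical encoding of the three steps named in the abstract.
STAGES = (
    Stage("fmri_calibrated_encoding", "all",
          "shared fMRI latent space that minimizes individual differences"),
    Stage("multi_subject_pretraining", "all",
          "guide the diffusion model with high- and low-level conditions"),
    Stage("single_subject_refining", "single",
          "adapt the pre-trained model to one particular individual"),
)


def select_samples(samples, stage, target_subject):
    """Return the samples a stage would train on.

    `samples` is a list of (subject_id, sample_id) tuples. Multi-subject
    stages keep everything; the single-subject stage keeps only the
    samples belonging to `target_subject`.
    """
    if stage.subjects == "all":
        return list(samples)
    return [s for s in samples if s[0] == target_subject]


if __name__ == "__main__":
    toy = [("subj1", "a"), ("subj2", "b"), ("subj1", "c")]
    for stage in STAGES:
        kept = select_samples(toy, stage, target_subject="subj1")
        print(f"{stage.name}: {len(kept)} sample(s) -> {stage.goal}")
```

The sketch captures only the data scope of each step; what each step optimizes (the encoder, the high-level semantic features, the low-level manipulation network) is described in the paper itself and is not modeled here.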

See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI

Reconstruction of Natural Images from Human fMRI Using a Three-Stage Multi-Level Deep Fusion Model

Single-subject Multi-contrast MRI Super-resolution via Implicit Neural Representations

Reconstructing Retinal Visual Images from 3T fMRI Data Enhanced by Unsupervised Learning

Brain3D: Generating 3D Objects from fMRI

MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction

UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity

MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding

NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties

Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors

Efficient Neural Decoding Based on Multimodal Training

Neural Pre-Processing: A Learning Framework for End-to-end Brain MRI Pre-processing

Reconstructing Natural Images from Human fMRI by Alternating Encoding and Decoding with Shared Autoencoder Regularization

MindBridge: A Cross-Subject Brain Decoding Framework

Functional diversity of visual cortex improves constraint-free natural image reconstruction from human brain activity

Mind-bridge: Reconstructing Visual Images Based on Diffusion Model from Human Brain Activity

MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion

MindLDM: Reconstruct Visual Stimuli from fMRI Using Latent Diffusion Model

A Hybrid Spatio-Temporal Deep Belief Network and Sparse Representation-Based Framework Reveals Multi-Level Core Functional Components in Decoding Multi-Task fMRI Signals