Mind-bridge: Reconstructing Visual Images Based on Diffusion Model from Human Brain Activity

Qing Liu, Hongqing Zhu, Ning Chen, Bingcang Huang, Weiping Lu, Ying Wang
DOI: https://doi.org/10.1007/s11760-024-03207-z
IF: 1.583
2024-01-01
Signal Image and Video Processing
Abstract: Human vision is mysterious and complex; the brain interprets the world through its connection with the eyes. In recent years, several methods have successfully used fMRI to reconstruct visual images from human brain activity. However, these methods focus mainly on the semantics of the reconstructed image and pay little attention to image structure and foreground targets. To alleviate this problem, we propose Mind-Bridge, a diffusion model-based architecture that uses fMRI to reconstruct visual images from human brain activity. Specifically, we first develop a novel Depth Structure Variational AutoEncoder (DSVAE) to capture image structural information at the initial stage. To obtain more foreground target information, we further introduce edge estimation through an edge detection operator. In addition, we use the Contrastive Language-Image Pre-training (CLIP) text and image encoders to provide text and image prompt conditions for visual reconstruction. Finally, Mind-Bridge uses Versatile Diffusion (VD) to fuse image information from the different stages for visual image reconstruction. Qualitative and quantitative results on the challenging Natural Scenes Dataset (NSD) show that Mind-Bridge is effective.
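The abstract mentions edge estimation through an edge detection operator but does not say which operator is used. As a rough illustration only, the sketch below assumes a Sobel operator applied to a grayscale image; the function name `sobel_edges`, the edge padding, and the scaling to [0, 1] are choices made for this example and are not taken from the paper.

```python
import numpy as np


def sobel_edges(image: np.ndarray) -> np.ndarray:
    """Sobel gradient magnitude of a 2-D grayscale image, scaled to [0, 1].

    Assumption: Sobel stands in for the unspecified edge detection operator.
    """
    # Horizontal-gradient kernel; its transpose gives the vertical gradient.
    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])
    ky = kx.T

    h, w = image.shape
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image.astype(np.float64), 1, mode="edge")

    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 cross-correlation one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window

    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    # A constant image has no edges; return the all-zero map unchanged.
    return magnitude / peak if peak > 0 else magnitude


if __name__ == "__main__":
    # Toy image: dark left half, bright right half, so one vertical edge.
    img = np.zeros((8, 8))
    img[:, 4:] = 1.0
    print(sobel_edges(img).round(2))
```

In a pipeline like the one described, an edge map of this kind would act as an additional structural cue alongside the DSVAE and CLIP conditions; how Mind-Bridge actually combines them is not specified in the abstract.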