Abstract: Electroencephalography (EEG)-based visual perception reconstruction has become an important area of research. Neuroscientific studies indicate that humans can decode imagined 3D objects by perceiving or imagining various visual information, such as color, shape, and rotation. Existing EEG-based visual decoding methods typically focus only on reconstructing 2D visual stimulus images and face challenges in generation quality, including inconsistencies in texture, shape, and color between the visual stimuli and the reconstructed images. This paper proposes an EEG-based 3D object reconstruction method with style consistency and diffusion priors. The method consists of an EEG-driven multi-task joint learning stage and an EEG-to-3D diffusion stage. The first stage uses a neural EEG encoder based on regional semantic learning, trained with a multi-task joint learning scheme that includes a masked EEG signal recovery task and an EEG-based visual classification task. The second stage introduces a latent diffusion model (LDM) fine-tuning strategy with style-conditioned constraints and a neural radiance field (NeRF) optimization strategy. This strategy explicitly embeds semantic- and location-aware latent EEG codes and combines them with visual stimulus maps to fine-tune the LDM. The fine-tuned LDM serves as a diffusion prior, which, combined with the style loss of visual stimuli, is used to optimize NeRF for generating 3D objects. Finally, through experimental validation, we demonstrate that this method can effectively use EEG data to reconstruct 3D objects with style consistency.

What problem does this paper attempt to address?

This paper addresses the challenges of reconstructing 3D objects from electroencephalogram (EEG) signals, in particular how to keep the reconstructed 3D objects consistent in style with the visual stimuli. Existing EEG-based visual decoding methods mainly focus on reconstructing 2D visual stimulus images and suffer from generation-quality problems, such as inconsistent texture, shape, and color.

#### Main problems include:

1. **Reconstructing 3D objects from EEG signals**: Most existing research focuses on reconstructing 2D images, and the reconstruction of 3D objects has not yet been explored in depth.
2. **Style consistency**: The reconstructed 3D objects need to be consistent in style with the original visual stimuli, including visual features such as color and shape.
3. **Semantic and location awareness**: To better capture the semantic information in EEG signals, a method that can learn regional semantic features is needed.
4. **Generation quality**: The generated 3D objects should be accurate in geometric structure and also consistent in visual style with the original stimulus images.

To solve these problems, the authors propose a framework that combines multi-task joint learning with a diffusion model and reconstructs 3D objects through neural radiance field (NeRF) optimization. The method has two stages:

- **First stage**: Train a neural EEG encoder with multi-task joint learning, comprising a masked EEG signal recovery task and an EEG-based visual classification task, to capture regional semantic features.
- **Second stage**: Introduce a latent diffusion model (LDM) fine-tuning strategy and a NeRF optimization strategy. The LDM is fine-tuned with style constraints and visual stimulus maps, and the fine-tuned LDM then guides NeRF optimization to generate style-consistent 3D objects.

According to the paper's experimental validation, the method can use EEG data to reconstruct 3D objects with style consistency, advancing research on EEG-based visual reconstruction.

### Formula summary

- **Diffusion model loss function**:
  \[ L_{\text{ldm}}=\mathbb{E}_{z, \epsilon \sim \mathcal{N}(0,1), t}\left[\left\|\epsilon-\epsilon_{\theta}(z_{t}, t, \tau_{\theta}(y))\right\|_{2}^{2}\right] \]
- **Regional semantic loss function**:
  \[ L_{\text{region}}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{M}p_{i,k}\cdot\log(\hat{p}_{i,k}) \]
- **Comprehensive loss function**:
  \[ L_{\text{ldm-region}}=\lambda_{\text{ldm}}L_{\text{ldm}}+\lambda_{\text{region}}L_{\text{region}} \]

These loss functions guide model training so that the generated 3D objects stay consistent with the original stimulus images in both geometric structure and visual style.
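To make the three loss terms concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the noise prediction \(\epsilon_{\theta}(z_{t}, t, \tau_{\theta}(y))\) is already available as an array, that \(p_{i,k}\) and \(\hat{p}_{i,k}\) are target and predicted probabilities over \(M\) classes for \(N\) items, and that the weights \(\lambda_{\text{ldm}}\) and \(\lambda_{\text{region}}\) default to 1.0 (the summary does not give their values). All function and parameter names are hypothetical.

```python
import numpy as np


def ldm_loss(eps, eps_pred):
    """Batch estimate of L_ldm: mean squared L2 norm of (eps - eps_pred).

    eps, eps_pred: arrays of shape (batch, ...). eps_pred stands in for the
    output of the noise predictor, which is not modeled here (assumption).
    """
    diff = (eps - eps_pred).reshape(eps.shape[0], -1)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def region_loss(p, p_hat, tiny=1e-12):
    """L_region: cross-entropy averaged over N items.

    p, p_hat: arrays of shape (N, M) holding target and predicted
    probabilities. `tiny` avoids log(0) and is an implementation detail
    of this sketch, not part of the formula.
    """
    return float(-np.mean(np.sum(p * np.log(p_hat + tiny), axis=1)))


def ldm_region_loss(eps, eps_pred, p, p_hat, lam_ldm=1.0, lam_region=1.0):
    """L_ldm-region: weighted sum of the two terms (weights are assumed)."""
    return lam_ldm * ldm_loss(eps, eps_pred) + lam_region * region_loss(p, p_hat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eps = rng.standard_normal((4, 8))                      # sampled noise
    eps_pred = eps + 0.1 * rng.standard_normal((4, 8))     # toy prediction
    p = np.eye(3)[[0, 2, 1, 0]]                            # one-hot targets
    logits = rng.standard_normal((4, 3))
    p_hat = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    print(ldm_region_loss(eps, eps_pred, p, p_hat))
```

In a real training loop both terms would be computed on tensors inside an automatic-differentiation framework so that gradients reach the LDM and the EEG encoder; plain NumPy is used here only to show the arithmetic.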

EEG-Driven 3D Object Reconstruction with Style Consistency and Diffusion Prior
