Abstract: Electroencephalography (EEG)-based applications in Brain-Computer Interfaces (BCIs), neurological disease diagnosis, rehabilitation, and other areas rely on extensive data for model development. This reliance raises storage and privacy concerns, because model development requires a significant amount of data and sharing EEG discloses sensitive information such as identity and health status. To address this problem, we introduce the paradigm of EEG condensation, which aims to generate a synthetic sample set that is highly information-concentrated yet not visually similar to the original signals. Correspondingly, we propose EEGCiD, a novel dataset condensation framework in which the knowledge of the original EEG dataset is condensed into a diffusion model. Specifically, EEGCiD first uses a deterministic denoising diffusion implicit model (DDIM) to store the information of the original dataset, and then optimizes the condensation latent codes $z$ to obtain the EEG condensation dataset. To enhance the modeling of EEG knowledge in the DDIM, we design a transformer architecture incorporating a spatial and temporal self-attention (STSA) block to replace the traditional U-Net backbone. In the condensation phase, EEGCiD randomly selects a subset of samples from the original dataset and passes them through the DDIM forward process to initialize the condensation latent codes $z$. It then optimizes $z$ by matching the feature distributions of the synthetic samples and the original dataset across multiple EEG decoding models. Extensive experiments on three EEG datasets demonstrate that the condensation dataset produced by the proposed model not only achieves superior classification performance with limited sample sizes, but also effectively defends against membership inference attacks (MIA).

Note to Practitioners: This paper investigates a novel EEG generation paradigm that extracts representative synthetic samples from large-scale datasets. Existing studies in EEG generation primarily concentrate on generating realistic signals, and some work claims that the generated EEG can serve as a substitute for the original dataset to achieve privacy preservation. In the EEGCiD framework, the deterministic DDIM is pre-trained on the original dataset to store its knowledge. In addition, an ensemble feature matching strategy is proposed to condense the information from the original dataset into a small set of latent codes. Experiments on three datasets demonstrate that EEGCiD addresses two fundamental challenges: 1) obtaining superior classification performance with a small dataset (limited storage capacity); 2) avoiding potential privacy issues during EEG sharing and transmission.
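
The condensation pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard deterministic DDIM update (no added noise), a hypothetical toy noise predictor `toy_eps_model` standing in for the pre-trained STSA transformer, two hypothetical hand-written feature extractors standing in for the EEG decoding models, and mean-feature matching as one simple way to compare feature distributions. All function names, the schedule `abars`, and the toy data are illustrative assumptions.

```python
import math


def ddim_step(x_t, eps, abar_t, abar_next):
    """Deterministic DDIM update from cumulative alpha abar_t to abar_next.

    The same formula serves both directions: encoding (toward more noise)
    and decoding (toward less noise). eps is the noise predicted at x_t.
    """
    x0_pred = [(x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
               for x, e in zip(x_t, eps)]
    return [math.sqrt(abar_next) * x0 + math.sqrt(1.0 - abar_next) * e
            for x0, e in zip(x0_pred, eps)]


def encode_to_latent(x0, eps_model, abars):
    """Run the deterministic forward pass to turn a real sample into a code z."""
    x = list(x0)
    for t in range(len(abars) - 1):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t + 1])
    return x


def decode_from_latent(z, eps_model, abars):
    """Run the deterministic reverse pass to turn a code z into a synthetic sample."""
    x = list(z)
    for t in range(len(abars) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t - 1])
    return x


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def feature_matching_loss(extractors, synthetic, real):
    """Sum over extractors of the squared distance between mean features."""
    loss = 0.0
    for extract in extractors:
        mu_syn = mean_vector([extract(x) for x in synthetic])
        mu_real = mean_vector([extract(x) for x in real])
        loss += sum((a - b) ** 2 for a, b in zip(mu_syn, mu_real))
    return loss


# Hypothetical stand-ins (assumptions, not part of EEGCiD).
def toy_eps_model(x, t):
    """Toy noise predictor; a trained network would be used in practice."""
    return [0.1 * v for v in x]


def mean_feature(x):
    """Toy feature: average amplitude of the signal."""
    return [sum(x) / len(x)]


def energy_feature(x):
    """Toy feature: average signal energy."""
    return [sum(v * v for v in x) / len(x)]


# Toy one-channel "EEG" samples and a decreasing cumulative-alpha schedule.
real = [[0.0, 1.0, 0.0, -1.0], [0.5, 1.5, 0.5, -0.5], [1.0, 2.0, 1.0, 0.0]]
abars = [1.0, 0.9, 0.7, 0.4]
extractors = [mean_feature, energy_feature]

# Initialize latent codes from a subset of real samples, then decode them.
latents = [encode_to_latent(x, toy_eps_model, abars) for x in real[:2]]
synthetic = [decode_from_latent(z, toy_eps_model, abars) for z in latents]

loss = feature_matching_loss(extractors, synthetic, real)
print("feature matching loss:", loss)
```

In a full pipeline the loss would be minimized with respect to the latent codes using automatic differentiation through the decoder; this sketch only evaluates the objective once to show how the pieces fit together.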

EEGCiD: EEG Condensation into Diffusion Model
