Abstract: Recent studies use a two-stage supervised framework to generate images that depict human perception of visual stimuli from EEG, a task referred to as EEG-visual reconstruction. These methods cannot, however, reproduce the exact visual stimulus, because the synthesized images are determined by human-specified annotations of the images rather than by the image data itself. Moreover, the synthesized images often suffer from noisy EEG encodings and unstable training of generative models, which makes them hard to recognize. We instead present a single-stage EEG-visual retrieval paradigm in which the data of the two modalities are correlated directly, rather than their annotations, allowing us to recover the exact visual stimulus for an EEG clip. We maximize the mutual information between the EEG encoding and the associated visual stimulus by optimizing a contrastive self-supervised objective, which brings two additional benefits. First, it enables EEG encodings to handle visual classes beyond those seen during training, since learning is not directed at class annotations. Second, the model is no longer required to generate every detail of the visual stimulus; it focuses on cross-modal alignment and retrieves images at the instance level, ensuring distinguishable model output. Empirical studies are conducted on the largest single-subject EEG dataset that measures brain activity evoked by image stimuli. We demonstrate that the proposed approach completes an instance-level EEG-visual retrieval task that existing methods cannot. We also examine the implications of a range of EEG and visual encoder structures. Furthermore, on the widely studied semantic-level EEG-visual classification task, the proposed method outperforms state-of-the-art supervised EEG-visual reconstruction approaches despite not using class annotations, particularly in open-class recognition.
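To make the contrastive objective and the instance-level retrieval step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a symmetric InfoNCE-style loss (a common lower bound on mutual information; the abstract does not name the exact loss), cosine similarity, and precomputed embeddings. The function names `info_nce_loss` and `retrieve`, the temperature value, and the embedding sizes are all illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def info_nce_loss(eeg_emb, img_emb, temperature=0.1):
    # Assumed setup: row i of eeg_emb and row i of img_emb form a positive
    # pair (an EEG clip and the image that evoked it); every other row in
    # the batch acts as a negative. No class labels are involved.
    eeg = l2_normalize(eeg_emb)
    img = l2_normalize(img_emb)
    logits = eeg @ img.T / temperature           # shape (batch, batch)
    idx = np.arange(len(eeg))
    loss_e2i = -log_softmax(logits, axis=1)[idx, idx].mean()  # EEG -> image
    loss_i2e = -log_softmax(logits, axis=0)[idx, idx].mean()  # image -> EEG
    return 0.5 * (loss_e2i + loss_i2e)

def retrieve(eeg_emb, gallery_emb, k=1):
    # Instance-level retrieval: rank gallery images by cosine similarity
    # to each EEG embedding and return the indices of the top-k matches.
    sims = l2_normalize(eeg_emb) @ l2_normalize(gallery_emb).T
    return np.argsort(-sims, axis=1)[:, :k]

# Toy usage with synthetic embeddings (hypothetical sizes: 8 pairs, 16 dims).
rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
eeg_aligned = img + 0.01 * rng.normal(size=(8, 16))  # well-aligned encoder
eeg_random = rng.normal(size=(8, 16))                # untrained encoder
loss_aligned = info_nce_loss(eeg_aligned, img)
loss_random = info_nce_loss(eeg_random, img)
top1 = retrieve(eeg_aligned, img, k=1)[:, 0]
```

In this toy setting the aligned embeddings yield a lower loss than the random ones, and top-1 retrieval returns each clip's own stimulus. Because the loss uses only paired instances, the same retrieval step applies unchanged to images from classes not seen during training, which is the property the abstract highlights.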

See What You See: Self-supervised Cross-modal Retrieval of Visual Stimuli from Brain Activity

Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion

Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals

BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction

Reconstructing Visual Stimulus Images from EEG Signals Based on Deep Visual Representation Model

EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

EEG-based Decoding of Selective Visual Attention in Superimposed Videos

Brain State Decoding for Rapid Image Retrieval

Decoding Natural Images from EEG for Object Recognition

Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning

A Visual EEG Paradigm and Dataset for Recognizing the Size Transformation of Images

EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels

Self-supervised contrastive learning for EEG-based cross-subject motor imagery recognition

EEG2IMAGE: Image Reconstruction from EEG Brain Signals

Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match Vs. Mismatch Classification

Image classification and reconstruction from low-density EEG

A large and rich EEG dataset for modeling human visual object recognition

Brain decoding: toward real-time reconstruction of visual perception

Research on EEG Feature Decoding Based on Stimulus Image

An Attention-based Bi-LSTM Method for Visual Object Classification Via EEG