Abstract: Objective: This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior work handles only electrodes arranged on a 2D grid (i.e., an electrocorticographic, or ECoG, array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG, or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements, and the trained model should perform well on participants unseen during training. Approach: We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train subject-specific models using data from a single participant as well as multi-patient models exploiting data from multiple participants. Main Results: The subject-specific models using only low-density 8x8 ECoG data achieved a high Pearson correlation coefficient between decoded and ground-truth spectrograms (PCC=0.817) across N=43 participants, outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating the additional strip, depth, and grid electrodes available in each participant (N=39) led to further improvement (PCC=0.838). For participants with only sEEG electrodes (N=9), subject-specific models still achieved comparable performance, with an average PCC=0.798. The multi-subject models achieved high performance on unseen participants, with an average PCC=0.765 in leave-one-out cross-validation. Significance: The proposed SwinTW decoder enables future speech neuroprostheses to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. Importantly, the generalizability of the multi-patient models suggests the exciting possibility of developing speech neuroprostheses for people with speech disability without relying on their own neural data for training, which is not always feasible.
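
The two evaluation ideas named in the abstract, the PCC between decoded and ground-truth spectrograms and leave-one-out cross-validation over participants, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names `spectrogram_pcc` and `leave_one_participant_out` are hypothetical, and the sketch assumes the correlation is taken over all time-frequency bins of a single spectrogram pair; the study may aggregate differently (for example, per trial or per frequency band).

```python
import numpy as np


def spectrogram_pcc(decoded, target):
    """Pearson correlation between two same-shaped spectrograms.

    Assumption: every time-frequency bin is treated as one sample,
    so both arrays are flattened before correlating.
    """
    decoded = np.asarray(decoded, dtype=float)
    target = np.asarray(target, dtype=float)
    if decoded.shape != target.shape:
        raise ValueError("spectrograms must have the same shape")
    d = decoded.ravel() - decoded.mean()
    t = target.ravel() - target.mean()
    denom = np.sqrt((d @ d) * (t @ t))
    if denom == 0.0:
        raise ValueError("PCC is undefined for a constant spectrogram")
    return float((d @ t) / denom)


def leave_one_participant_out(participant_ids):
    """Yield (training_ids, held_out_id) pairs, one fold per participant."""
    unique_ids = sorted(set(participant_ids))
    for held_out in unique_ids:
        training = [p for p in unique_ids if p != held_out]
        yield training, held_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.random((128, 40))  # hypothetical (time frames, frequency bins)
    noisy = truth + 0.1 * rng.standard_normal(truth.shape)
    print("PCC:", spectrogram_pcc(noisy, truth))
    for training, held_out in leave_one_participant_out(["P1", "P2", "P3"]):
        print("train on", training, "and test on", held_out)
```

In a leave-one-out evaluation of this kind, a model would be trained on each `training` list and scored with the PCC on the held-out participant; averaging those scores gives a single figure comparable in spirit to the reported unseen-participant PCC.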

Decoding imagined speech from EEG signals using hybrid-scale spatial-temporal dilated convolution network

Speech neuromuscular decoding based on spectrogram images using conformal predictors with Bi-LSTM.

Decoding Imagined Speech from EEG Data: A Hybrid Deep Learning Approach to Capturing Spatial and Temporal Features

Delving into Temporal-Spectral Connections in Spike-LFP Decoding by Transformer Networks

Speech decoding from stereo-electroencephalography (sEEG) signals using advanced deep learning methods

Hierarchical Deep Feature Learning For Decoding Imagined Speech From EEG

A Novel Deep Learning Architecture for Decoding Imagined Speech from EEG

Decoding High-level Imagined Speech using Attention-based Deep Neural Networks

Dual-TSST: A Dual-Branch Temporal-Spectral-Spatial Transformer Model for EEG Decoding

Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals

Towards Unified Neural Decoding of Perceived, Spoken and Imagined Speech from EEG Signals

Du-IN-v2: Unleashing the Power of Vector Quantization for Decoding Cognitive States from Intracranial Neural Signals

Transformer-based Spatial-Temporal Feature Learning for EEG Decoding

Delineating neural contributions to electroencephalogram-based speech decoding

Auditory attention decoding from electroencephalography based on long short-term memory networks

A Spatial Filter Temporal Graph Convolutional Network for decoding motor imagery EEG signals

Decoding Silent Reading EEG Signals Using Adaptive Feature Graph Convolutional Network

[Magnetic resonance in cerebral anoxia].

Subject-Agnostic Transformer-Based Neural Speech Decoding from Surface and Depth Electrode Signals

DAL: Feature Learning from Overt Speech to Decode Imagined Speech-based EEG Signals with Convolutional Autoencoder

Towards Neural Decoding of Imagined Speech based on Spoken Speech