EEG2TEXT: Open Vocabulary EEG-to-Text Decoding with EEG Pre-Training and Multi-View Transformer

Hanwen Liu, Daniel Hajialigol, Benny Antony, Aiguo Han, Xuan Wang
2024-05-03
Abstract: Deciphering the intricacies of the human brain has captivated curiosity for centuries. Recent strides in Brain-Computer Interface (BCI) technology, particularly using motor imagery, have restored motor functions such as reaching, grasping, and walking in paralyzed individuals. However, unraveling natural language from brain signals remains a formidable challenge. Electroencephalography (EEG) is a non-invasive technique used to record electrical activity in the brain by placing electrodes on the scalp. Previous studies of EEG-to-text decoding have achieved high accuracy on small closed vocabularies, but still fall short of high accuracy when dealing with large open vocabularies. We propose a novel method, EEG2TEXT, to improve the accuracy of open vocabulary EEG-to-text decoding. Specifically, EEG2TEXT leverages EEG pre-training to enhance the learning of semantics from EEG signals and proposes a multi-view transformer to model the EEG signal processing by different spatial regions of the brain. Experiments show that EEG2TEXT has superior performance, outperforming the state-of-the-art baseline methods by a large margin of up to 5% in absolute BLEU and ROUGE scores. EEG2TEXT shows great potential for a high-performance open-vocabulary brain-to-text system to facilitate communication.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to improve the accuracy of decoding text from electroencephalogram (EEG) signals, especially for large open vocabularies. Existing EEG-to-text decoding methods perform well on small closed vocabularies but lose accuracy on large open vocabularies. The researchers propose a new method, EEG2TEXT, which uses EEG pre-training to enhance the learning of semantic information from EEG signals and adopts a multi-view transformer to model how different spatial regions of the brain process EEG signals. The key points mentioned in the paper include:
1. EEG2TEXT introduces a convolutional neural network (CNN) module to handle long EEG signals and improve the model's processing capability.
2. In the pre-training step, the model learns to reconstruct the input from randomly masked EEG signals, which helps it learn the semantics of EEG signals.
3. In the proposed multi-view transformer architecture, each view transformer encodes a different region of the brain, taking into account the different roles these regions play in language processing.
Experimental results show that EEG2TEXT outperforms existing state-of-the-art baseline methods by up to 5% in absolute BLEU and ROUGE scores, indicating superior performance on open-vocabulary EEG-to-text decoding. The approach shows promise for high-performance open-vocabulary brain-to-text systems that facilitate communication. The researchers also open-source their code and dataset to foster future research in this field.
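To make the masked pre-training and multi-view ideas above more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The masking granularity (whole time steps), the zero-fill value, the 25% mask ratio, the region names, and the channel indices are all assumptions chosen for illustration, and the function names `mask_time_steps`, `masked_mse`, and `split_into_views` are hypothetical.

```python
import random


def mask_time_steps(signal, mask_ratio, rng):
    """Zero out a random subset of time steps (assumed masking scheme).

    signal: list of time steps, each a list of per-channel values.
    Returns the corrupted copy and the sorted indices that were masked.
    """
    n_steps = len(signal)
    n_masked = int(round(mask_ratio * n_steps))
    masked_idx = sorted(rng.sample(range(n_steps), n_masked))
    masked_set = set(masked_idx)
    corrupted = [
        [0.0] * len(step) if t in masked_set else list(step)
        for t, step in enumerate(signal)
    ]
    return corrupted, masked_idx


def masked_mse(original, reconstruction, masked_idx):
    """Mean squared error computed only over the masked time steps."""
    total, count = 0.0, 0
    for t in masked_idx:
        for x, y in zip(original[t], reconstruction[t]):
            total += (x - y) ** 2
            count += 1
    return total / count if count else 0.0


def split_into_views(signal, region_channels):
    """Group channels by brain region so each view sees only its own channels."""
    return {
        region: [[step[c] for c in channels] for step in signal]
        for region, channels in region_channels.items()
    }


# Toy data: 20 time steps, 6 channels of Gaussian noise (not real EEG).
rng = random.Random(0)
eeg = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(20)]

corrupted, masked_idx = mask_time_steps(eeg, mask_ratio=0.25, rng=rng)

# A perfect reconstruction has zero loss; the zero-filled input does not.
loss_perfect = masked_mse(eeg, eeg, masked_idx)
loss_zero_fill = masked_mse(eeg, corrupted, masked_idx)

# Hypothetical region-to-channel mapping; real electrode layouts differ.
views = split_into_views(eeg, {"region_a": [0, 1, 2], "region_b": [3, 4, 5]})
```

In a full model, a learned encoder would produce the reconstruction and the masked loss would drive pre-training, while each entry of `views` would feed a separate transformer encoder. The sketch shows only the data handling around those components.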