EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model

Yuqi Chen, Kan Ren, Kaitao Song, Yansen Wang, Yifan Wang, Dongsheng Li, Lili Qiu
2024-01-12
Abstract: Self-supervised learning has emerged as a highly effective approach in the fields of natural language processing and computer vision. It is also applicable to brain signals such as electroencephalography (EEG) data, given the abundance of unlabeled data available across a wide spectrum of real-world medical applications, ranging from seizure detection to wave analysis. Existing works that apply self-supervised learning to EEG modeling mainly pretrain on each individual dataset corresponding to a single downstream task. They therefore cannot leverage the power of abundant data and may derive sub-optimal solutions that lack generalization. Moreover, these methods rely on end-to-end model learning, which is not easy for humans to understand. In this paper, we present a novel EEG foundation model, namely EEGFormer, pretrained on large-scale compound EEG data. The pretrained model can not only learn universal representations of EEG signals with adaptable performance on various downstream tasks but also provide interpretable outcomes of the useful patterns within the data. To validate the effectiveness of our model, we extensively evaluate it on various downstream tasks and assess the performance under different transfer settings. Furthermore, we demonstrate how the learned model exhibits transferable anomaly detection performance and provides valuable interpretability of the acquired patterns via self-supervised learning.
Signal Processing, Artificial Intelligence, Machine Learning, Multimedia, Neurons and Cognition
What problem does this paper attempt to address?
This paper addresses two core issues:

1. **Large-scale self-supervised pretraining**: Existing work that applies self-supervised learning to electroencephalography (EEG) data typically pretrains on a single dataset tied to a single downstream task, so it cannot take full advantage of the large amount of unlabeled EEG data available. The paper therefore proposes a new EEG foundation model, EEGFormer, which is pretrained on large-scale compound EEG data to learn general representations that adapt to a variety of downstream tasks.
2. **Model interpretability**: Current end-to-end learning methods can be effective, but their internal mechanisms are hard for humans to understand, which is a concern in medical applications where opaque predictions may lead to unsafe outcomes. EEGFormer introduces a discrete representation learning algorithm, which improves the model's generalization ability and also makes the useful patterns found in the data easier to interpret.

In summary, EEGFormer aims to make EEG signal processing more effective and more reliable by combining large-scale self-supervised pretraining with an interpretable, discrete representation of the learned patterns.
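
The summary above does not spell out how the discrete representation is computed. As an illustration only, one common form of discrete representation learning is vector quantization: each continuous feature vector is replaced by the index of its nearest entry in a learned codebook, so a signal becomes a sequence of discrete codes that can be counted and inspected. The sketch below assumes a fixed toy codebook and made-up three-sample patches; the names `quantize`, `codebook`, and `patches` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    patches: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each patch to the index and value of its nearest codebook entry."""
    indices: List[int] = []
    quantized: List[List[float]] = []
    for patch in patches:
        # Nearest-neighbour lookup over all codebook entries.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(patch, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Toy codebook (assumed values; in practice the entries would be learned).
codebook = [
    [0.0, 0.0, 0.0],  # code 0: flat segment
    [1.0, 1.0, 1.0],  # code 1: elevated segment
    [0.0, 5.0, 0.0],  # code 2: spike-like segment
]

# Made-up continuous features for three short signal patches.
patches = [
    [0.1, -0.1, 0.0],
    [0.2, 4.6, 0.1],
    [0.9, 1.1, 1.0],
]

indices, quantized = quantize(patches, codebook)
print(indices)  # [0, 2, 1]
```

Because every patch is reduced to a code index, one can check which codes occur in a recording and look at the codebook entry behind each one. This is the sense in which a discrete representation can support interpretability, although the paper's actual procedure may differ from this sketch.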