EEG2Rep: Enhancing Self-supervised EEG Representation Through Informative Masked Inputs

Navid Mohammadi Foumani, Geoffrey Mackellar, Soheila Ghane, Saad Irtza, Nam Nguyen, Mahsa Salehi
2024-06-18
Abstract: Self-supervised approaches for electroencephalography (EEG) representation learning face three specific challenges inherent to EEG data: (1) the low signal-to-noise ratio, which challenges the quality of the representation learned; (2) the wide range of amplitudes, from very small to relatively large due to factors such as inter-subject variability, which risks models being dominated by higher amplitude ranges; and (3) the absence of explicit segmentation in the continuous-valued sequences, which can result in less informative representations. To address these challenges, we introduce EEG2Rep, a self-prediction approach for self-supervised representation learning from EEG. The two core novel components of EEG2Rep are as follows: (1) instead of learning to predict the masked input from raw EEG, EEG2Rep learns to predict the masked input in latent representation space, and (2) instead of conventional masking methods, EEG2Rep uses a new semantic subsequence preserving (SSP) method, which provides informative masked inputs to guide EEG2Rep to generate rich semantic representations. In experiments on 6 diverse EEG tasks with subject variability, EEG2Rep significantly outperforms state-of-the-art methods. We show that our semantic subsequence preserving improves the existing masking methods in the self-prediction literature and find that preserving 50% of EEG recordings yields the most accurate results on average across all 6 tasks. Finally, we show that EEG2Rep is robust to noise, addressing a significant challenge that exists in EEG data. Models and code are available at: https://github.com/Navidfoumani/EEG2Rep
Signal Processing, Machine Learning
What problem does this paper attempt to address?
This paper addresses three challenges in self-supervised representation learning from EEG (electroencephalography) data: the low signal-to-noise ratio, which degrades the quality of the learned representation; the wide range of amplitudes, which can let higher-amplitude signals dominate the model; and the absence of explicit segmentation in the continuous-valued sequences, which can make the learned representations less informative. To address these issues, the paper introduces EEG2Rep, a self-prediction method that learns to predict masked inputs in the latent representation space rather than reconstructing the raw EEG signal. EEG2Rep also uses a new masking method, semantic subsequence preserving (SSP), which provides informative masked inputs that guide the model toward rich semantic representations. On six EEG tasks with inter-subject variability, EEG2Rep significantly outperforms state-of-the-art methods, and it is shown to be robust to noise. The paper also reports that SSP improves on the existing masking methods in the self-prediction literature, and that preserving 50% of each EEG recording gives the most accurate results on average across the six tasks. Together, these components aim to make self-supervised EEG representations more useful for downstream tasks such as classification.
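The two ideas summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names, the `num_blocks` parameter, the way gaps are sampled, and the mean-squared-error loss are all assumptions made for illustration; only the 50% preserve ratio and the idea of comparing predictions with targets in latent space come from the abstract.

```python
import numpy as np


def contiguous_preserve_mask(num_patches, preserve_ratio=0.5, num_blocks=2, rng=None):
    """Return a boolean array: True = patch stays visible, False = patch is masked.

    Assumption: visible patches are grouped into `num_blocks` contiguous runs,
    one simple way to keep whole subsequences intact. The paper's SSP procedure
    may differ in its details.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_keep = int(round(num_patches * preserve_ratio))
    n_mask = num_patches - n_keep

    # Split the visible budget into runs of nearly equal length.
    runs = np.full(num_blocks, n_keep // num_blocks)
    runs[: n_keep % num_blocks] += 1

    # Randomly distribute the masked patches over the gaps before, between,
    # and after the runs. A gap of length 0 lets two runs touch.
    gaps = rng.multinomial(n_mask, np.full(num_blocks + 1, 1.0 / (num_blocks + 1)))

    keep = np.zeros(num_patches, dtype=bool)
    pos = 0
    for gap, run in zip(gaps[:-1], runs):
        pos += gap
        keep[pos : pos + run] = True
        pos += run
    return keep


def masked_latent_loss(predicted, target, keep):
    """Mean squared error between predicted and target latents on masked patches only.

    Assumption: `predicted` and `target` are (num_patches, dim) arrays of latent
    vectors, and MSE stands in for whatever loss the paper actually uses.
    """
    diff = predicted[~keep] - target[~keep]
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keep = contiguous_preserve_mask(num_patches=64, preserve_ratio=0.5, rng=rng)
    print("visible patches:", int(keep.sum()), "of", keep.size)

    # Toy latents standing in for the outputs of a predictor and a target encoder.
    target = rng.normal(size=(64, 8))
    predicted = target + 0.1 * rng.normal(size=(64, 8))
    print("latent loss on masked patches:", masked_latent_loss(predicted, target, keep))
```

Computing the loss only on masked positions, and on latent vectors rather than raw samples, is what separates this style of self-prediction from reconstructing the noisy input signal directly.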