A Residual Based Attention Model for EEG Based Sleep Staging

Wei Qu, Zhiyong Wang, Hong Hong, Zheru Chi, David Dagan Feng, Ron Grunstein, Christopher Gordon
DOI: https://doi.org/10.1109/JBHI.2020.2978004
IF: 7.7
2020-01-01
IEEE Journal of Biomedical and Health Informatics
Abstract: Sleep staging classifies a subject's sleep state into stages such as Wake and Rapid Eye Movement (REM), and it plays an indispensable role in the diagnosis and treatment of sleep disorders. Because manual staging by trained sleep experts is time-consuming, tedious, and subjective, many automatic methods have been developed for accurate, efficient, and objective sleep staging. Recently, deep learning based methods have been proposed for electroencephalogram (EEG) based sleep staging with promising results. However, most of these methods feed raw EEG signals directly into convolutional neural networks (CNNs) without considering the domain knowledge of EEG staging. Moreover, to capture temporal information, most existing methods use recurrent neural networks such as Long Short-Term Memory (LSTM), which are ineffective at modelling global temporal context and difficult to train. Inspired by clinical guidelines for sleep staging such as the American Academy of Sleep Medicine (AASM) rules, in which different stages are generally characterized by EEG waveforms of various frequencies, we propose a multi-scale deep architecture that decomposes an EEG signal into different frequency bands as input to CNNs. To model global temporal context, we use the multi-head self-attention module of the transformer model, which not only improves performance but also shortens training time. In addition, we adopt a residual based architecture, which allows end-to-end training. Experimental results on two widely used sleep staging datasets, the Montreal Archive of Sleep Studies (MASS) and sleep-EDF, demonstrate the effectiveness and significant efficiency (up to 12 times less training time) of our proposed method over the state of the art.
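To make the frequency-band idea in the abstract concrete, the following is a minimal sketch of splitting one EEG epoch into band-limited signals. It is not the authors' implementation: the band edges are the conventional delta/theta/alpha/beta ranges (an assumption, since the abstract does not list the bands used), the names `BANDS`, `dft`, `idft_real`, and `decompose` are hypothetical, and a plain DFT mask stands in for whatever filtering the paper applies.

```python
import cmath
import math

# Conventional EEG band edges in Hz. These are an assumption for illustration;
# the abstract does not state which bands the paper uses.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def decompose(signal, fs, bands=BANDS):
    """Split `signal` (sampled at `fs` Hz) into one time series per band.

    Each band keeps only the DFT bins whose frequency lies in [lo, hi) and
    zeroes the rest. Bins k and n - k are treated alike so the result is real.
    """
    n = len(signal)
    spectrum = dft(signal)
    out = {}
    for name, (lo, hi) in bands.items():
        masked = []
        for k in range(n):
            # Frequency of bin k; the upper half mirrors the lower half.
            f = min(k, n - k) * fs / n
            masked.append(spectrum[k] if lo <= f < hi else 0)
        out[name] = idft_real(masked)
    return out
```

Calling `decompose(epoch, fs)` on a list of samples returns a dictionary with one equal-length series per band, each of which could then be passed to its own CNN branch. In practice a faster transform such as `numpy.fft.rfft` or a proper band-pass filter would likely replace the naive DFT used here for self-containment.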