Abstract: Introduction: Automated sleep staging using deep learning models typically requires training on hundreds of sleep recordings, and pre-training on public databases is therefore common practice. However, sleep staging performance may be suboptimal when the source and target datasets are mismatched, for example in population characteristics (e.g., an unrepresented sleep disorder) or sensors (e.g., alternative channel locations for wearable EEG).
Methods: We investigated three strategies for training an automated single-channel EEG sleep stager: pre-training (i.e., training on the original source dataset), training-from-scratch (i.e., training on the new target dataset), and fine-tuning (i.e., training on the original source dataset, then fine-tuning on the new target dataset). The source dataset consisted of F3-M2 recordings from healthy subjects (N = 94). Performance of the training strategies was evaluated using Cohen's kappa (κ) in eight smaller target datasets, formed by combining four populations, namely healthy subjects (N = 60) and patients with obstructive sleep apnea (OSA, N = 60), insomnia (N = 60), and REM sleep behavioral disorder (RBD, N = 22), with two EEG channels, F3-M2 and F3-F4.
Results: No differences in performance between the training strategies were observed in the age-matched F3-M2 datasets, with an average performance across strategies of κ = .83 in healthy, κ = .77 in insomnia, and κ = .74 in OSA subjects. However, in the RBD set, where data availability was limited, fine-tuning was the preferred method (κ = .67), with an average increase in κ of .15 compared with pre-training and training-from-scratch. In the presence of channel mismatches, targeted training is required, either through training-from-scratch or fine-tuning, which increased κ by .17 on average.
Discussion: We found that, when channel and/or population mismatches cause suboptimal sleep staging performance, fine-tuning can match or exceed the performance of a model built from scratch while requiring a smaller sample size. In contrast to insomnia and OSA data, RBD data contain characteristics, either inherent to the pathology or age-related, that apparently demand targeted training.
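
The abstract reports all results as Cohen's kappa (κ), which corrects the raw epoch-by-epoch agreement between two scorings for the agreement expected by chance. The following is a minimal sketch of that metric, not the authors' evaluation code: the function name `cohens_kappa`, the stage labels, and the toy hypnograms are assumptions made for illustration, and the abstract does not state whether κ was computed per recording or on pooled epochs.

```python
from collections import Counter


def cohens_kappa(reference, predicted):
    """Cohen's kappa between two equally long sequences of stage labels."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(reference)

    # Observed agreement: fraction of epochs where both scorings match.
    p_observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Chance agreement: product of the two marginal stage frequencies,
    # summed over all stages.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_chance = sum(ref_counts[s] * pred_counts[s] for s in ref_counts) / (n * n)

    if p_chance == 1.0:
        # Both scorings use one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical 10-epoch hypnograms (manual reference vs. model output).
reference = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "REM", "REM"]
predicted = ["W", "N1", "N1", "N2", "N2", "N3", "N3", "N3", "REM", "N2"]

kappa = cohens_kappa(reference, predicted)
print(f"kappa = {kappa:.2f}")  # raw agreement is 0.70, kappa is about 0.62
```

In this toy case the two scorings agree on 7 of 10 epochs, but 0.21 of that agreement is expected by chance given the stage frequencies, so κ is lower than the raw accuracy.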

Do Not Sleep on Traditional Machine Learning: Simple and Interpretable Techniques Are Competitive to Deep Learning for Sleep Scoring

Automatic Classification of Sleep Stages Based on Raw Single-Channel EEG

Personalizing deep learning models for automatic sleep staging

Performance and utility trade-off in interpretable sleep staging

Automatic sleep staging of EEG signals: recent development, challenges, and future directions

Deep Convolutional Neural Networks for Interpretable Analysis of EEG Sleep Stage Scoring

Automated scoring of pre-REM sleep in mice with deep learning

Automatic Sleep Scoring: A Deep Learning Architecture for Multi-Modality Time Series

Application of Machine Learning to Sleep Stage Classification

ZleepAnlystNet: a novel deep learning model for automatic sleep stage scoring based on single-channel raw EEG data using separating training

Transparency in Sleep Staging: Deep Learning Method for EEG Sleep Stage Classification with Model Interpretability

DeepSleepNet-Lite: A Simplified Automatic Sleep Stage Scoring Model with Uncertainty Estimates

DeepSleepNet: A Model for Automatic Sleep Stage Scoring Based on Raw Single-Channel EEG

Research and application of deep learning-based sleep staging: Data, modeling, validation, and clinical practice

Automated Classification of Sleep Stages and EEG Artifacts in Mice with Deep Learning

Large-Scale Automated Sleep Staging

Sleep Staging Framework with Physiologically Harmonized Sub-Networks

Automatic sleep-stage classification of heart rate and actigraphy data using deep and transfer learning approaches

S4Sleep: Elucidating the design space of deep-learning-based sleep stage classification models

Refining sleep staging accuracy: Transfer learning coupled with scorability models

Deep transfer learning for automated single-lead EEG sleep staging with channel and population mismatches