0662 Accurate Automated Sleep Staging of Narcoleptic Patients Using a Machine Learning Model
Ahmet Cakir,David Josephs,Dave Kleinschmidt,Jay Pathmanathan,Jacob Donoghue,Alexander Chan
DOI: https://doi.org/10.1093/sleep/zsae067.0662
IF: 6.313
2024-04-20
SLEEP
Abstract: Introduction Accurate sleep staging of EEG data from polysomnography (PSG) is important in the diagnosis of narcolepsy. Human sleep staging is costly and labor intensive, but automated sleep staging algorithms must be rigorously tested in narcoleptic patients to ensure valid performance. PSGs of narcoleptic patients tend to be more fragmented and variable than those of non-narcoleptic populations, making it challenging for both humans and automated algorithms to stage sleep accurately. Here, we evaluate the performance of a deep-learning model, validated in a general sleep clinic population, for staging nocturnal PSGs in patients with narcolepsy.
Methods SleepStageML™, a deep-learning model for performing sleep staging on EEG signals, was trained on a large database of polysomnography recordings from a heterogeneous population within the Beacon Clinico-PSG Database. The algorithm was evaluated on a held-out set of 28 overnight PSGs from patients with narcolepsy or hypersomnolence and 57 overnight PSGs from individuals without narcolepsy or hypersomnolence. Each PSG was manually scored by a human expert, and the performance of the automated algorithm was compared across the two cohorts.
Results Automated sleep staging performance was high across both cohorts. The average F1-score was 0.758 for the control cohort and 0.744 for the narcolepsy cohort. The positive percent agreements (PPAs) for the control cohort were 87%, 38%, 84%, 91%, and 93% for stages W, N1, N2, N3, and R, respectively. For the narcolepsy cohort, the PPAs across the same stages were 91%, 33%, 81%, 86%, and 88%, respectively. The algorithm's median absolute error in estimating REM latency, REM duration, and REM percentage in the control cohort was 1.25 minutes, 8.5 minutes, and 2 percentage points, respectively. The same metrics for the narcolepsy cohort were 2.75 minutes, 11.75 minutes, and 3 percentage points, respectively.
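The agreement metrics reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes 30-second epochs, treats the human score as the reference, takes PPA to be the per-stage fraction of reference epochs that the model labels identically, takes the "average F1-score" to be an unweighted mean over stages (the abstract does not say how the average was formed), and defines REM latency as the time from the first non-wake epoch to the first REM epoch. All function names and the toy hypnograms are hypothetical.

```python
from statistics import median

STAGES = ["W", "N1", "N2", "N3", "R"]
EPOCH_MIN = 0.5  # assumption: 30-second scoring epochs


def ppa(reference, predicted, stage):
    """Fraction of reference epochs of `stage` that the model also labeled `stage`."""
    ref_idx = [i for i, s in enumerate(reference) if s == stage]
    if not ref_idx:
        return None  # stage absent from the reference scoring
    return sum(predicted[i] == stage for i in ref_idx) / len(ref_idx)


def f1(reference, predicted, stage):
    """Per-stage F1 computed from epoch-level counts."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == stage and p == stage for r, p in pairs)
    fp = sum(r != stage and p == stage for r, p in pairs)
    fn = sum(r == stage and p != stage for r, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else None


def macro_f1(reference, predicted):
    """Unweighted mean of per-stage F1 (assumed averaging scheme)."""
    scores = [f1(reference, predicted, s) for s in STAGES]
    scores = [x for x in scores if x is not None]
    return sum(scores) / len(scores)


def rem_latency_min(hypnogram):
    """Minutes from first non-wake epoch to first REM epoch, or None if absent."""
    try:
        onset = next(i for i, s in enumerate(hypnogram) if s != "W")
        first_rem = next(
            i for i, s in enumerate(hypnogram) if s == "R" and i >= onset
        )
    except StopIteration:
        return None
    return (first_rem - onset) * EPOCH_MIN


def median_abs_error(reference_values, predicted_values):
    """Median absolute error across recordings for one summary parameter."""
    return median(abs(r - p) for r, p in zip(reference_values, predicted_values))


# Toy hypnograms for one recording (made up for illustration only).
reference = ["W", "W", "N1", "N2", "N2", "N3", "R", "R"]
predicted = ["W", "N1", "N1", "N2", "N2", "N2", "R", "R"]

per_stage_ppa = {s: ppa(reference, predicted, s) for s in STAGES}
average_f1 = macro_f1(reference, predicted)
latency_error = abs(rem_latency_min(reference) - rem_latency_min(predicted))
```

In a cohort-level evaluation, `median_abs_error` would be applied to the per-recording REM latency, REM duration, and REM percentage values from the human and automated scorings.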
Conclusion A deep-learning model trained on diverse data automatically and accurately staged PSGs from narcoleptic patients, with performance comparable to that of a human expert. The algorithm estimated REM parameters accurately in both cohorts. Automated staging algorithms like the one described here have the potential to accelerate diagnosis and to help monitor the therapeutic efficacy of narcolepsy treatments by staging sleep more efficiently and consistently.
neurosciences,clinical neurology