LPSGM: A Unified Flexible Large PSG Model for Sleep Staging and Mental Disorder Diagnosis

Guifeng Deng, Mengfan Niu, Yuxi Luo, Shuying Rao, Jing Sun, Junyi Xie, Zhenghe Yu, Wenjuan Liu, Sha Zhao, Gang Pan, Xiaojing Li, Wei Deng, Wanjuan Guo, Tao Li, Haiteng Jiang
DOI: https://doi.org/10.1101/2024.12.11.24318815
2024-12-11
Abstract: We present the Large PSG Model (LPSGM), a unified and flexible framework for sleep staging and disease diagnosis using polysomnography (PSG) data. LPSGM is designed to address the challenges of cross-center generalization in sleep staging and to enable fine-tuning for downstream disease diagnosis tasks. LPSGM introduces a unified training framework for heterogeneous datasets and allows flexible channel input adjustments during inference. The model is first trained on 220,500 hours of whole-night PSG from 16 public datasets, achieving robust sleep staging performance. It is then fine-tuned on target center data for various disease classification tasks, including narcolepsy diagnosis, anxiety and depression detection, and the classification of healthy versus depressed individuals. LPSGM outperforms baseline models on both sleep staging and disease diagnosis tasks. Our results demonstrate that LPSGM not only enhances sleep staging accuracy but also improves the diagnosis of sleep-related and psychiatric disorders, showing promise for clinical applications in sleep medicine and psychiatry.
What problem does this paper attempt to address?
This paper addresses two main problems: **the generalization of cross-center sleep staging** and **disease diagnosis based on polysomnography (PSG) data**. The authors propose a unified and flexible large PSG model (LPSGM) that targets the following challenges:

1. **Cross-center generalization**: Existing deep-learning methods suffer from significant domain gaps between datasets from different centers, so model performance declines on new datasets. LPSGM handles heterogeneous datasets through a unified training framework, which improves the generalization ability of the model.
2. **Fine-tuning for disease diagnosis tasks**: In addition to sleep staging, LPSGM can be fine-tuned on disease-specific PSG data to diagnose sleep-related and psychiatric disorders such as narcolepsy, anxiety, and depression.

### Main contributions

1. **Unified training framework**: LPSGM can handle datasets with different channel configurations during training, which lets it integrate multiple large-scale, multi-center public datasets and bridge the domain gaps between centers.
2. **Flexibility in the inference stage**: LPSGM can trade off accuracy against inference speed by adjusting the number of input channels at inference time without changing the model structure, making it suitable for various application scenarios.
3. **Cross-center generalization ability**: Experimental results show that LPSGM performs well in cross-center tests. Even on previously unseen datasets it achieves accuracy comparable to that of fully supervised training, demonstrating its "plug-and-play" ability and potential for clinical applications.
4. **Fine-tuning ability for disease diagnosis**: LPSGM can be pre-trained on large-scale public datasets and then fine-tuned on disease-specific PSG data to improve diagnostic accuracy, demonstrating its broad application potential in clinical diagnosis.

### Formula representation

The key formulas in the paper are the following:

- **Feature vector embedding**:

\[ \tilde{e}_{i,l,c} = e_{i,l,c} \oplus c_e^c \oplus t_e^l \]

  where $\oplus$ represents the concatenation operation along the feature dimension, $e_{i,l,c}$ is the original feature vector from the epoch encoder, $c_e^c$ is the channel embedding in the channel embedding list $CE$, and $t_e^l$ is the time embedding in the time embedding list $TE$.

- **Binary mask generation**:

\[ M_{i,j} = \begin{cases} 1, & j > C_i \times T_i \\ 0, & j \leq C_i \times T_i \end{cases} \]

  Here $M_{i,j} = 1$ marks position $j$ of sample $i$ as lying beyond that sample's $C_i \times T_i$ tokens (read here as channels times epochs), that is, as padding.

- **Encoding process of the Transformer block**:

\[ E_\ell' = \mathrm{MSA}(\mathrm{LN}(E_{\ell - 1})) + E_{\ell - 1}, \quad \ell = 1, \dots, N \]

\[ E_\ell = \mathrm{FFN}(\mathrm{LN}(E_\ell')) + E_\ell', \quad \ell = 1, \dots, N \]

\[ E_{out} = \mathrm{LN}(E_N) \]

These formulas show the key steps of LPSGM in processing PSG data.
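The three formulas above can be tied together in a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the embedding sizes, the token ordering (epoch-major), the random weights, the ReLU feed-forward layer, the parameter-free layer norm, and the use of single-head attention in place of MSA are all hypothetical choices made to keep the example short. It also assumes $C_i$ and $T_i$ are the channel and epoch counts of sample $i$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these values come from the paper.
D_FEAT, D_CH, D_TIME = 8, 4, 4
D_MODEL = D_FEAT + D_CH + D_TIME        # token width after concatenation
CE = rng.normal(size=(3, D_CH))         # channel embedding list CE
TE = rng.normal(size=(2, D_TIME))       # time embedding list TE


def build_tokens(features, channel_ids):
    """features: (T, C, D_FEAT) epoch-encoder outputs of one sample.
    Returns (C*T, D_MODEL) tokens, each concat(e, c_e, t_e)."""
    T, C, _ = features.shape
    tokens = [np.concatenate([features[l, k], CE[channel_ids[k]], TE[l]])
              for l in range(T) for k in range(C)]
    return np.stack(tokens)


def pad_and_mask(samples):
    """Pad to a common length; M[i, j] = 1 marks padding (j beyond C_i*T_i)."""
    max_len = max(len(s) for s in samples)
    batch = np.zeros((len(samples), max_len, D_MODEL))
    mask = np.ones((len(samples), max_len), dtype=int)
    for i, s in enumerate(samples):
        batch[i, :len(s)] = s
        mask[i, :len(s)] = 0
    return batch, mask


def layer_norm(x, eps=1e-5):
    # LN without learnable gain/bias (simplification).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def attention(x, mask, p):
    # Single-head stand-in for MSA; padded keys receive zero weight.
    q, k, v = x @ p["Wq"], x @ p["Wk"], x @ p["Wv"]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = np.where(mask[:, None, :] == 1, -1e9, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    return (w / w.sum(axis=-1, keepdims=True)) @ v


def block(E, mask, p):
    E_prime = attention(layer_norm(E), mask, p) + E             # E'_l
    ffn = np.maximum(layer_norm(E_prime) @ p["W1"], 0.0) @ p["W2"]
    return ffn + E_prime                                        # E_l


def encode(E, mask, layers):
    for p in layers:
        E = block(E, mask, p)
    return layer_norm(E)                                        # E_out


def make_layer():
    shapes = {"Wq": (D_MODEL, D_MODEL), "Wk": (D_MODEL, D_MODEL),
              "Wv": (D_MODEL, D_MODEL), "W1": (D_MODEL, 2 * D_MODEL),
              "W2": (2 * D_MODEL, D_MODEL)}
    return {name: 0.1 * rng.normal(size=s) for name, s in shapes.items()}


# Two samples with different channel counts share one batch.
a = build_tokens(rng.normal(size=(2, 3, D_FEAT)), channel_ids=[0, 1, 2])
b = build_tokens(rng.normal(size=(2, 1, D_FEAT)), channel_ids=[0])
E, M = pad_and_mask([a, b])
layers = [make_layer() for _ in range(2)]   # N = 2 blocks
E_out = encode(E, M, layers)
print(M)
```

Because a sample with one channel simply yields a shorter token sequence, the same weights serve any channel configuration; the mask keeps padded positions from influencing the real tokens. This is one plausible reading of how the flexible channel input described above could work.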