SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals

Rahul Thapa, Bryan He, Magnus Ruud Kjaer, Hyatt Moore, Gauri Ganjoo, Emmanuel Mignot, James Zou

2024-05-28

Abstract: Sleep is a complex physiological process evaluated through various modalities recording electrical brain, cardiac, and respiratory activities. We curate a large polysomnography dataset from over 14,000 participants comprising over 100,000 hours of multi-modal sleep recordings. Leveraging this extensive dataset, we developed SleepFM, the first multi-modal foundation model for sleep analysis. We show that a novel leave-one-out approach for contrastive learning significantly improves downstream task performance compared to representations from standard pairwise contrastive learning. A logistic regression model trained on SleepFM's learned embeddings outperforms an end-to-end trained convolutional neural network (CNN) on sleep stage classification (macro AUROC 0.88 vs 0.72 and macro AUPRC 0.72 vs 0.48) and sleep disordered breathing detection (AUROC 0.85 vs 0.69 and AUPRC 0.77 vs 0.61). Notably, the learned embeddings achieve 48% top-1 average accuracy in retrieving the corresponding recording clips of other modalities from 90,000 candidates. This work demonstrates the value of holistic multi-modal sleep modeling to fully capture the richness of sleep recordings. SleepFM is open source and available at https://github.com/rthapa84/sleepfm-codebase.
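
The retrieval result in the abstract can be made concrete with a minimal sketch of top-1 cross-modal retrieval. The abstract does not say how similarity is computed, so cosine similarity is an assumption here, and the function name `top1_retrieval_accuracy` and the toy data are hypothetical rather than taken from the SleepFM codebase.

```python
import numpy as np

def top1_retrieval_accuracy(queries, candidates):
    """Fraction of queries whose nearest candidate is the matching clip.

    Row i of `queries` (one modality) and row i of `candidates` (another
    modality) are assumed to come from the same recording clip.
    Similarity is cosine similarity, which is an assumption of this sketch.
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = q @ c.T                      # (n_queries, n_candidates)
    predicted = sims.argmax(axis=1)     # index of the best-matching candidate
    return float((predicted == np.arange(len(q))).mean())

# Toy data: two "modalities" that are noisy views of the same clips.
rng = np.random.default_rng(0)
clips = rng.normal(size=(100, 32))
ecg_like = clips + 0.1 * rng.normal(size=clips.shape)
resp_like = clips + 0.1 * rng.normal(size=clips.shape)
print(top1_retrieval_accuracy(ecg_like, resp_like))
```

With tens of thousands of candidates the full similarity matrix becomes large, so a practical version would likely compute it in chunks of queries.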

Machine Learning, Artificial Intelligence, Signal Processing

What problem does this paper attempt to address?

The paper aims to improve automated sleep analysis, in particular sleep stage classification and sleep disordered breathing (SDB) detection, through multimodal representation learning. The research team developed SleepFM, a multimodal foundation model that integrates brain activity signals (BAS), electrocardiogram (ECG), and respiratory signals to make automatic analysis of sleep recordings more accurate. The main objectives are:

1. **Developing SleepFM**: A multimodal foundation model based on contrastive learning (CL), trained on a large-scale polysomnography (PSG) dataset, that captures the interplay between modalities to learn more robust physiological representations.
2. **Proposing a new contrastive learning method**: A novel "leave-one-out" contrastive learning strategy that contrasts the embedding of one modality with the average embedding of the remaining modalities and significantly improves downstream task performance.
3. **Evaluating the effectiveness of SleepFM**: Comparing SleepFM against end-to-end trained convolutional neural network (CNN) models on downstream tasks such as sleep stage classification and SDB detection, where SleepFM performs better.
4. **Exploring the model's generalization ability**: Running age prediction, gender classification, and retrieval experiments to further assess the quality of SleepFM embeddings.
5. **Conducting few-shot evaluation**: Measuring performance when only a small amount of labeled data is available, as is often the case in real-world applications.

In summary, SleepFM aims to advance the diagnosis and monitoring of sleep-related diseases by exploiting multimodal data to make automatic analysis more accurate and efficient.
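
The leave-one-out strategy described in objective 2 can be sketched in a few lines of NumPy. The summary only states that one modality's embedding is contrasted with the average embedding of the remaining modalities; the symmetric InfoNCE-style loss, the L2 normalization, the `temperature` value, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along one axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def leave_one_out_loss(embeddings, temperature=0.1):
    """Leave-one-out contrastive loss over a dict of modality embeddings.

    `embeddings` maps a modality name to a (batch, dim) array; row i of
    every array is assumed to come from the same recording clip.
    The loss form and temperature are assumptions of this sketch.
    """
    names = list(embeddings)
    normed = {k: l2_normalize(v) for k, v in embeddings.items()}
    total = 0.0
    for held_out in names:
        # Average the embeddings of all modalities except the held-out one.
        rest = np.mean([normed[k] for k in names if k != held_out], axis=0)
        rest = l2_normalize(rest)
        logits = normed[held_out] @ rest.T / temperature   # (batch, batch)
        diag = np.arange(logits.shape[0])                  # positives sit on the diagonal
        loss_rows = -log_softmax(logits, axis=1)[diag, diag].mean()
        loss_cols = -log_softmax(logits, axis=0)[diag, diag].mean()
        total += 0.5 * (loss_rows + loss_cols)
    return total / len(names)

# Toy batch: three modalities as noisy views of the same underlying clips.
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 8))
batch = {m: base + 0.01 * rng.normal(size=base.shape) for m in ("bas", "ecg", "resp")}
print(leave_one_out_loss(batch))
```

Standard pairwise contrastive learning would instead compute one such loss for every pair of modalities; the leave-one-out variant asks each modality to agree with a summary of all the others at once.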

wav2sleep: A Unified Multi-Modal Approach to Sleep Stage Classification from Physiological Signals

CareSleepNet: A Hybrid Deep Learning Network for Automatic Sleep Staging

Machine learning-empowered sleep staging classification using multi-modality signals

Multi-channel EEG-based Sleep Stage Classification with Joint Collaborative Representation and Multiple Kernel Learning

Multi-Modal Sleep Stage Classification With Two-Stream Encoder-Decoder

CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities

Extracting Multi-Scale and Salient Features by MSE Based U-Structure and CBAM for Sleep Staging

Multi-channel EEG-based sleep staging using brain functional connectivity and domain adaptation

An End-to-End Multi-Channel Convolutional Bi-LSTM Network for Automatic Sleep Stage Detection

MMASleepNet: A multimodal attention network based on electrophysiological signals for automatic sleep staging

Simplifying Multimodal With Single EOG Modality for Automatic Sleep Staging

MaskSleepNet: A Cross-modality Adaptation Neural Network for Heterogeneous Signals Processing in Sleep Staging

Sleep Stage Classification Using Covariance Features of Multi-Channel Physiological Signals on Riemannian Manifolds

Enhanced Chameleon Swarm Optimization-Based Ensemble Learning Approach For Sleep Stage Classification Framework

Ubi-SleepNet: Advanced Multimodal Fusion Techniques for Three-stage Sleep Classification Using Ubiquitous Sensing

Transparency in Sleep Staging: Deep Learning Method for EEG Sleep Stage Classification with Model Interpretability

An automatic method using MFCC features for sleep stage classification

A generative foundation model for five-class sleep staging with arbitrary sensor input

Ensemble of Convolution Neural Networks on Heterogeneous Signals for Sleep Stage Scoring