Towards Foundation Models for Critical Care Time Series

Manuel Burger, Fedor Sergeev, Malte Londschien, Daphné Chopard, Hugo Yèche, Eike Gerdes, Polina Leshetkina, Alexander Morgenroth, Zeynep Babür, Jasmina Bogojeska, Martin Faltys, Rita Kuznetsova, Gunnar Rätsch

2024-11-25

Abstract: Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data (such as vital signs, lab results, and treatments in critical care) remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multivariate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.

Machine Learning

What problem does this paper attempt to address?

This paper addresses the challenges of training large-scale multivariate time series models on intensive care data (such as vital signs, laboratory results, and treatments). Specifically, the authors focus on the following issues:

1. **Insufficient dataset size and diversity**: Existing intensive care time series datasets are relatively small and mostly come from a single medical center, which limits model generalization and robustness. Combining multiple datasets can increase both the number and the diversity of patients.
2. **Distribution shift**: Recording formats and treatment policies differ substantially between hospitals and countries, so models perform poorly when applied across hospitals or countries. Addressing these shifts is key to building a robust foundation model.
3. **Lack of a unified benchmark**: Most previous studies have focused on single-center data and lack a comprehensive multi-center evaluation. A comprehensive benchmark is therefore needed to evaluate machine learning models on intensive care time series data.

To address these issues, the authors aim to:

- Create a large, multi-center intensive care time series dataset that covers a wide range of clinical features and harmonizes core treatment variables.
- Establish a comprehensive benchmark for evaluating machine learning models on the new dataset, especially under distribution shift across hospitals and countries.

Through these efforts, the authors hope to lay the groundwork for future foundation model research and to promote deep learning in intensive care, especially few-shot learning and fine-tuning for small, specific patient groups.
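To make the idea of harmonizing treatment variables across datasets more concrete, here is a minimal sketch of mapping dataset-specific variable names and units onto one shared schema. Everything in it is an assumption made for illustration: the dataset names (`hospital_a`, `hospital_b`), the variable names, the `Record` type, and the `harmonize` helper are hypothetical and are not taken from the paper or its dataset. The only fixed fact used is the unit conversion 1 mg/h = 1000/60 mcg/min.

```python
from dataclasses import dataclass

# Hypothetical mapping: local variable name -> (shared name, factor to the shared unit).
# Names and schema are illustrative assumptions, not the paper's actual schema.
VARIABLE_MAP = {
    "hospital_a": {
        "HR": ("heart_rate_bpm", 1.0),
        "norepi_mcg_min": ("norepinephrine_mcg_per_min", 1.0),
    },
    "hospital_b": {
        "heart_rate": ("heart_rate_bpm", 1.0),
        # mg/h -> mcg/min: multiply by 1000 (mg -> mcg) and divide by 60 (h -> min)
        "noradrenaline_mg_h": ("norepinephrine_mcg_per_min", 1000.0 / 60.0),
    },
}


@dataclass(frozen=True)
class Record:
    dataset: str
    patient_id: str
    time_min: int
    variable: str
    value: float


def harmonize(records):
    """Map dataset-specific records onto the shared variable names and units."""
    out = []
    for r in records:
        mapping = VARIABLE_MAP[r.dataset]
        if r.variable not in mapping:
            continue  # variables without a shared counterpart are dropped
        name, factor = mapping[r.variable]
        out.append(Record(r.dataset, r.patient_id, r.time_min, name, r.value * factor))
    return out


raw = [
    Record("hospital_a", "a-1", 0, "HR", 82.0),
    Record("hospital_a", "a-1", 0, "norepi_mcg_min", 5.0),
    Record("hospital_b", "b-7", 0, "heart_rate", 95.0),
    Record("hospital_b", "b-7", 0, "noradrenaline_mg_h", 0.6),
    Record("hospital_b", "b-7", 0, "local_only_score", 3.0),
]
harmonized = harmonize(raw)
for rec in harmonized:
    print(rec)
```

After this step, both hypothetical sources report the same treatment in the same unit, so a model trained on one can at least be evaluated on the other. Differences in how the treatment is actually administered (the treatment policy) remain in the data and are the distribution shift the benchmark is meant to study.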
