Deep Representation Learning-Based Dynamic Trajectory Phenotyping for Acute Respiratory Failure in Medical Intensive Care Units

Alan Wu, Tilendra Choudhary, Pulakesh Upadhyaya, Ayman Ali, Philip Yang, Rishikesan Kamaleswaran
2024-05-04
Abstract: Sepsis-induced acute respiratory failure (ARF) is a serious complication with a poor prognosis. This paper presents a deep representation learning-based phenotyping method to identify distinct groups of clinical trajectories of septic patients with ARF. For this retrospective study, we created a dataset from electronic medical records (EMR) consisting of data from sepsis patients admitted to medical intensive care units who required at least 24 hours of invasive mechanical ventilation at a quaternary care academic hospital in the southeastern USA for the years 2016-2021. A total of N=3349 patient encounters were included in this study. The Clustering Representation Learning on Incomplete Time Series Data (CRLI) algorithm was applied to a parsimonious set of EMR variables in this dataset. To validate the optimal number of clusters, the K-means algorithm was used in conjunction with dynamic time warping. Our model yielded four distinct patient phenotypes that were characterized as liver dysfunction/heterogeneous, hypercapnia, hypoxemia, and multiple organ dysfunction syndrome by a critical care expert. A Kaplan-Meier analysis to compare the 28-day mortality trends exhibited significant differences (p < 0.005) between the four phenotypes. The study demonstrates the utility of our deep representation learning-based approach in unraveling phenotypes that reflect the heterogeneity in sepsis-induced ARF in terms of different mortality outcomes and severity. These phenotypes might reveal important clinical insights into an effective prognosis and tailored treatment strategies.
Signal Processing, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to identify clinical trajectory subtypes of patients with sepsis-induced acute respiratory failure (ARF) in the intensive care unit (ICU) using deep representation learning. Specifically, the researchers attempt to address the following issues:

1. **Identify different patient subtypes**: Use deep learning to extract features from multivariate time series data and identify patient subtypes with different clinical trajectories.
2. **Improve prognosis accuracy**: Help clinicians predict patient prognosis more accurately by distinguishing these subtypes.
3. **Develop personalized treatment plans**: Formulate treatment strategies tailored to each subtype to improve treatment outcomes.
4. **Understand the heterogeneity of ARF**: Reveal how sepsis-induced ARF differs across subtypes, including in mortality rates and severity levels.

### Research Background

Sepsis is a severe syndrome defined as life-threatening organ dysfunction caused by a dysregulated immune response to infection. Patients with sepsis often develop acute respiratory failure (ARF) in the ICU, requiring mechanical ventilation support. The prognosis of ARF is poor, with a mortality rate as high as 43%, posing a significant challenge to critical care specialists. The diverse etiology, physiological processes, and immune responses of ARF lead to highly heterogeneous clinical trajectories, making them difficult to interpret and characterize.

### Method Overview

The researchers applied a deep learning algorithm called Clustering Representation Learning on Incomplete Time Series Data (CRLI) to cluster multivariate time series data collected from electronic medical records (EMR). The specific steps include:

1. **Data Collection**: Data from 3349 patient encounters were collected at a quaternary care academic hospital in the southeastern United States between 2016 and 2021. The cohort consisted of sepsis patients admitted to medical ICUs who required at least 24 hours of invasive mechanical ventilation.
2. **Data Preprocessing**: A concise set of cardiopulmonary variables was selected, including partial pressure of oxygen (PaO2), partial pressure of carbon dioxide (PaCO2), fraction of inspired oxygen (FiO2), pulse oximetry (SpO2), heart rate (HR), and mean arterial pressure (MAP). Outlier processing, missing value imputation, and standardization were performed.
3. **Trajectory Clustering**: The CRLI algorithm was used for deep representation learning to generate data-driven subtypes. Additionally, K-means combined with dynamic time warping (DTW) was used to validate the optimal number of clusters.
4. **Subtype Characterization**: A critical care specialist characterized and named the resulting subtypes.

### Main Findings

1. **Subtype Classification**: The researchers identified four main subtypes:
   - Liver Dysfunction/Heterogeneous Subtype
   - Hypercapnic Subtype
   - Hypoxemic Subtype
   - Multiple Organ Dysfunction Syndrome (MODS) Subtype
2. **Survival Analysis**: Kaplan-Meier analysis revealed significant differences (p < 0.005) in 28-day survival probabilities among the subtypes. The hypercapnic subtype had the highest survival probability, while the MODS subtype had the lowest.
3. **Comorbidity Analysis**: Analysis using the Charlson Comorbidity Index (CCI) and the age-adjusted Charlson Comorbidity Index (ACCI) showed significant differences in comorbidity burden among the subtypes. For example, the MODS subtype had the highest CCI and ACCI, and the highest incidence of cardiac arrest.

### Conclusion

This study demonstrates the potential of deep representation learning for identifying clinical trajectory subtypes in patients with sepsis-induced ARF. These subtypes may help improve prognostic accuracy and guide the formulation of personalized treatment strategies, which could in turn improve patient outcomes. Future research can further explore the biological mechanisms of these subtypes to develop new therapeutic approaches.
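
To make the validation step in the method overview more concrete (K-means combined with dynamic time warping), the following minimal Python sketch shows the two building blocks that step relies on: per-variable standardization and a DTW distance between two multivariate trajectories. It is an illustration under stated assumptions, not the authors' implementation: the function names `zscore` and `dtw_distance` are hypothetical, the trajectories are made-up toy values rather than study data, and the sketch assumes missing values have already been imputed.

```python
import math


def zscore(values):
    """Standardize one variable's samples to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # a constant series has std 0; avoid dividing by it
    return [(v - mean) / std for v in values]


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two trajectories.

    Each trajectory is a list of equal-width tuples (one tuple per time step,
    one entry per variable). The local cost is the Euclidean distance between
    time steps; the result is the minimum accumulated cost over all monotonic
    alignments. Runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays on the same step
                cost[i][j - 1],      # b advances, a stays on the same step
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy two-variable trajectories (hypothetical values, already standardized).
slow = [(0.0, 0.0), (1.0, 0.5), (1.0, 0.5), (2.0, 1.0)]
fast = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
shifted = [(1.0, 1.0), (2.0, 1.5), (3.0, 2.0)]

print(dtw_distance(slow, fast))     # 0.0: same shape, different pace
print(dtw_distance(fast, shifted))  # positive: the values themselves differ
```

DTW is a natural fit for this kind of comparison because two patients can follow the same physiological course at different speeds; a point-by-point Euclidean distance would penalize that difference in pace, whereas DTW aligns the trajectories before measuring how far apart they are. A DTW-based K-means would use a distance like this in place of the usual Euclidean distance when assigning trajectories to clusters.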