Abstract: Health event prediction has been empowered by the rapid and wide adoption of electronic health records (EHR). In the Intensive Care Unit (ICU), accurately predicting health-related events in advance is essential for delivering treatment and interventions that improve patient outcomes. EHR data are multi-modal, comprising clinical text, time series, structured data, and more. Most work on health event prediction focuses on a single modality, e.g., text or tabular EHR, and learning effectively from multi-modal EHR remains a challenge. Inspired by the strong text-processing capability of large language models (LLMs), we propose CKLE, a framework for health event prediction that distills knowledge from an LLM and learns from multi-modal EHR. Applying LLMs to health event prediction raises two challenges. First, most LLMs can handle only text, not other modalities such as structured data. Second, privacy requirements in health applications call for local deployment of the LLM, which may be constrained by available computational resources. CKLE addresses these scalability and portability challenges by distilling cross-modality knowledge from the LLM into the health event predictive model. To take full advantage of the LLM, the raw clinical text is refined and augmented with prompt learning, and the embeddings of the clinical text are generated by the LLM. To distill the LLM's knowledge into the predictive model effectively, we design a cross-modality knowledge distillation (KD) method whose training objective accounts for multiple modalities and patient similarity. The KD loss function consists of two parts. The first is a cross-modality contrastive loss, which models the correlation between different modalities of the same patient. The second is a patient similarity learning loss, which models the correlations between similar patients. Cross-modality knowledge distillation thus transfers the rich information in clinical text and the knowledge of the LLM into the predictive model operating on structured EHR data. To demonstrate the effectiveness of CKLE, we evaluate it on two health event prediction tasks in cardiology: heart failure prediction and hypertension prediction. We select 7125 patients from the MIMIC-III dataset and split them into train/validation/test sets. CKLE achieves up to a 4.48% improvement in accuracy over state-of-the-art predictive models designed for health event prediction. The results show that CKLE significantly surpasses the baseline prediction models in both the normal and limited-label settings. We also conduct a case study on cardiology disease analysis for heart failure and hypertension prediction. Through feature importance calculation, we analyse the salient features related to cardiological disease, which correspond to medical domain knowledge. The superior performance and interpretability of CKLE pave a promising way to leverage the power and knowledge of LLMs for health event prediction in real-world clinical settings.
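
The two-part KD loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: an InfoNCE-style contrastive term stands in for the cross-modality contrastive loss, a match between pairwise cosine-similarity matrices stands in for the patient similarity learning loss, and the function names, the `temperature` value, and the weights `alpha` and `beta` are hypothetical rather than taken from CKLE.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (zero vectors are left unchanged).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_modality_contrastive_loss(text_emb, struct_emb, temperature=0.1):
    """Assumed InfoNCE-style term: the structured-data embedding of patient i
    should be closest to the LLM text embedding of the same patient i."""
    t = [l2_normalize(v) for v in text_emb]
    s = [l2_normalize(v) for v in struct_emb]
    total = 0.0
    for i in range(len(s)):
        logits = [dot(s[i], t[j]) / temperature for j in range(len(t))]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # negative log-softmax of the true pair
    return total / len(s)

def patient_similarity_loss(text_emb, struct_emb):
    """Assumed similarity term: mean squared difference between the pairwise
    cosine similarities of the text (teacher) and structured (student) views."""
    t = [l2_normalize(v) for v in text_emb]
    s = [l2_normalize(v) for v in struct_emb]
    n = len(s)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (dot(t[i], t[j]) - dot(s[i], s[j])) ** 2
    return total / (n * n)

def kd_loss(text_emb, struct_emb, alpha=1.0, beta=1.0):
    # Hypothetical weighted sum of the two parts of the KD objective.
    return (alpha * cross_modality_contrastive_loss(text_emb, struct_emb)
            + beta * patient_similarity_loss(text_emb, struct_emb))

if __name__ == "__main__":
    # Toy embeddings for two patients (rows are patients).
    text = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[2.0, 0.0], [0.0, 3.0]]
    shuffled = [[0.0, 3.0], [2.0, 0.0]]
    print("aligned :", kd_loss(text, aligned))
    print("shuffled:", kd_loss(text, shuffled))
```

In this toy setting, structured embeddings that line up with the same patient's text embedding yield a much lower loss than embeddings assigned to the wrong patient, which is the behaviour a cross-modality distillation objective of this kind is meant to reward.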

DKEC: Domain Knowledge Enhanced Multi-Label Classification for Diagnosis Prediction

MVKT-ECG: Efficient single-lead ECG classification for multi-label arrhythmia by multi-view knowledge transferring

Triplet attention and dual-pool contrastive learning for clinic-driven multi-label medical image classification

MVKT-ECG: Efficient Single-lead ECG Classification on Multi-Label Arrhythmia by Multi-View Knowledge Transferring

KMTLabeler: an Interactive Knowledge-Assisted Labeling Tool for Medical Text Classification

KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

Distilling the Knowledge from Large-language Model for Health Event Prediction

Multi-label Classification for Clinical Text with Feature-level Attention

KAME: Knowledge-based Attention Model for Diagnosis Prediction in Healthcare

KG-MTT-BERT: Knowledge Graph Enhanced BERT for Multi-Type Medical Text Classification

Detecting Mental and Physical Disorders Using Multi-Task Learning Equipped with Knowledge Graph Attention Network

Multi-scale Label Attention Network Based on Abductive Causal Graph for Disease Diagnosis

Multi-Label Learning With Visual-Semantic Embedded Knowledge Graph for Diagnosis of Radiology Imaging

Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification

DualMAR: Medical-Augmented Representation from Dual-Expertise Perspectives

Knowledge-based Dynamic Prompt Learning for Multi-label Disease Diagnosis

K-Diag: Knowledge-enhanced Disease Diagnosis in Radiographic Imaging

Low-Resolution Chest X-ray Classification via Knowledge Distillation and Multi-task Learning

Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction

Label-based Topic Modeling to Enhance Medical Triage for Medical Triage Robots