ColaCare: Enhancing Electronic Health Record Modeling through Large Language Model-Driven Multi-Agent Collaboration

Zixiang Wang, Yinghao Zhu, Huiya Zhao, Xiaochen Zheng, Tianlong Wang, Wen Tang, Yasha Wang, Chengwei Pan, Ewen M. Harrison, Junyi Gao, Liantao Ma
2024-10-03
Abstract: We introduce ColaCare, a framework that enhances Electronic Health Record (EHR) modeling through multi-agent collaboration driven by Large Language Models (LLMs). Our approach seamlessly integrates domain-specific expert models with LLMs to bridge the gap between structured EHR data and text-based reasoning. Inspired by clinical consultations, ColaCare employs two types of agents: DoctorAgent and MetaAgent, which collaboratively analyze patient data. Expert models process and generate predictions from numerical EHR data, while LLM agents produce reasoning references and decision-making reports within the collaborative consultation framework. We additionally incorporate the Merck Manual of Diagnosis and Therapy (MSD) medical guideline within a retrieval-augmented generation (RAG) module for authoritative evidence support. Extensive experiments conducted on four distinct EHR datasets demonstrate ColaCare's superior performance in mortality prediction tasks, underscoring its potential to revolutionize clinical decision support systems and advance personalized precision medicine. The code, complete prompt templates, more case studies, etc. are publicly available at the anonymous link: https://colacare.netlify.app.
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to solve several key problems in electronic health record (EHR) modeling:

1. **Limitations of data-driven methods**:
   - Existing EHR modeling methods are mainly purely data-driven, end-to-end methods. They do not draw on external knowledge and cannot understand the clinical significance of record features, treating them only as variables without semantic context.
   - These "black-box" methods are sensitive to the input data distribution and prone to overfitting, especially when the number and diversity of training samples are limited, which is common in real-world clinical practice.
2. **Insufficient interpretability of models**:
   - Existing methods have limited interpretability and usually rely on traditional feature importance analysis techniques, such as attention mechanisms, SHAP (SHapley Additive exPlanations), and activation-level visualization. These techniques provide only basic interpretability, which is not sufficient for meaningful communication with doctors.
3. **Challenges of knowledge embedding**:
   - Some works attempt to embed knowledge through ICD codes and knowledge graphs, but these methods face challenges in practice: they rely on manually constructed forms of knowledge that are slow to update, so they are often inconsistent with the latest medical research, clinical reports, or updated guidelines, all of which are crucial for clinical prediction tasks.
4. **Limitations of large language models (LLMs) in structured EHR data analysis**:
   - Although LLMs perform well on natural language tasks and medical Q&A, their ability to analyze and predict from structured EHR data is limited. In particular, their reasoning ability in few-shot settings still lags significantly behind traditional methods.
### Solutions

To solve the above problems, the paper proposes the ColaCare framework, which enhances EHR modeling through LLM-driven multi-agent collaboration. Specifically, the main contributions of the ColaCare framework include:

1. **Combining external knowledge**:
   - External knowledge is introduced through the Retrieval-Augmented Generation (RAG) module, so that the model is not only driven by EHR data but can also draw on external knowledge and review its own outputs.
2. **Multi-perspective clinical decision-making evidence**:
   - ColaCare outputs clinical decision-making evidence from multiple doctor agents, each offering its own perspective. This enhances model transparency and provides a human-understandable basis for decisions, which supports doctors' diagnostic reasoning.
3. **Experimental verification**:
   - Extensive experimental results show that ColaCare performs strongly on clinical mortality prediction tasks across four different EHR datasets. Case studies highlight the soundness and interpretability of its generated reports, which the authors present as a potentially revolutionary solution for the development of clinical decision support systems and personalized precision medicine.

Through these innovations, the ColaCare framework aims to provide a human-interpretable EHR modeling method that gives individualized reasons for each prediction along with patient-specific evidence, and that can identify and reflect on potentially fatal errors in the prediction results and in the evidence-finding process.
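
The collaboration described above can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the expert models are stand-in linear scorers, retrieval is plain keyword overlap rather than an embedding-based RAG module, and the MetaAgent simply averages risks instead of prompting an LLM. All names (`GUIDELINE_SNIPPETS`, `retrieve`, `doctor_agent`, `meta_agent`), the guideline sentences, and the patient values are hypothetical and chosen only to show the data flow: expert prediction, evidence retrieval per doctor agent, then aggregation.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical guideline sentences standing in for a retrieval corpus.
GUIDELINE_SNIPPETS = [
    "Elevated serum creatinine indicates reduced kidney function.",
    "Low serum albumin is associated with poor nutritional status.",
    "High blood urea nitrogen can reflect impaired renal clearance.",
]


def retrieve(query_terms, corpus, top_k=1):
    """Toy keyword-overlap retrieval (assumption: a real RAG module would use embeddings)."""
    terms = {t.lower() for t in query_terms}

    def score(doc):
        words = set(doc.lower().replace(".", "").split())
        return len(words & terms)

    return sorted(corpus, key=score, reverse=True)[:top_k]


@dataclass
class DoctorReport:
    agent: str
    risk: float      # predicted mortality risk in [0, 1]
    evidence: list   # retrieved guideline sentences


def doctor_agent(name, feature, weight, patient, corpus):
    """Stand-in for a DoctorAgent: one expert score plus retrieved evidence.

    The 'expert model' here is a single-feature linear scorer on a
    z-scored value, clamped to [0, 1]; it is purely illustrative.
    """
    risk = min(1.0, max(0.0, 0.5 + weight * patient[feature]))
    evidence = retrieve([feature], corpus)
    return DoctorReport(agent=name, risk=risk, evidence=evidence)


def meta_agent(reports):
    """Stand-in for the MetaAgent: average the risks and pool the evidence."""
    return {
        "risk": mean(r.risk for r in reports),
        "evidence": sorted({e for r in reports for e in r.evidence}),
    }


# Hypothetical patient with z-scored lab values.
patient = {"creatinine": 2.0, "albumin": -1.5, "urea": 0.5}

reports = [
    doctor_agent("doctor_a", "creatinine", 0.1, patient, GUIDELINE_SNIPPETS),
    doctor_agent("doctor_b", "albumin", -0.1, patient, GUIDELINE_SNIPPETS),
]
decision = meta_agent(reports)

print(f"aggregated risk: {decision['risk']:.3f}")
for sentence in decision["evidence"]:
    print("evidence:", sentence)
```

Each doctor agent in the sketch contributes both a number and the text that supports it, which mirrors the summary's point that the final report should carry evidence a clinician can read, not just a score.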