Towards Holistic Disease Risk Prediction using Small Language Models

Liv Björkdahl, Oskar Pauli, Johan Östman, Chiara Ceccobello, Sara Lundell, Magnus Kjellberg
2024-08-13
Abstract: Data in the healthcare domain arise from a variety of sources and modalities, such as x-ray images, continuous measurements, and clinical notes. Medical practitioners integrate these diverse data types daily to make informed and accurate decisions. With recent advancements in language models capable of handling multimodal data, it is a logical progression to apply these models to the healthcare sector. In this work, we introduce a framework that connects small language models to multiple data sources, aiming to predict the risk of various diseases simultaneously. Our experiments encompass 12 different tasks within a multitask learning setup. Although our approach does not surpass state-of-the-art methods specialized for single tasks, it demonstrates competitive performance and underscores the potential of small language models for multimodal reasoning in healthcare.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the following issues:

1. **Multimodal Disease Risk Prediction**: The paper proposes a framework in which Small Language Models (SLMs) take input from several data sources (time series, text, and images) and predict the risk of multiple diseases simultaneously. The framework is also designed to cope with the class imbalance typical of medical data.
2. **Multi-task Learning**: A single model learns 12 tasks: length of hospital stay prediction, 48-hour mortality prediction, and 10 diagnostic tasks related to chest pathologies (fracture, lung lesion, cardiomediastinal enlargement, consolidation, pneumonia, atelectasis, lung opacity, pneumothorax, edema, and pericardial effusion).
3. **Model Generalization Ability**: The paper demonstrates that a unified model can generalize across modalities, data sources, and tasks, so one model covers all tasks instead of requiring a separately trained model for each.
4. **Model Scalability**: The method is presented as scalable to data of any modality, and the code has been open-sourced so that other researchers can reproduce and improve it.

The main contribution of the paper is to demonstrate how Small Language Models (such as Gemma-2B and Phi-3-mini-4k) can process multimodal input and predict multiple health-related tasks within a unified framework. Although the method does not surpass state-of-the-art single-task models on some tasks, its performance is competitive and showcases the potential of Small Language Models in the medical field.
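To make the combination of multi-task learning and class imbalance more concrete, here is a minimal NumPy sketch of one common way to train several binary prediction heads jointly: a class-weighted binary cross-entropy averaged over all tasks. It is not taken from the paper. Treating all 12 tasks as binary, the negative-to-positive weighting heuristic, and the names `TASKS`, `positive_class_weights`, and `multitask_weighted_bce` are assumptions made purely for illustration.

```python
import numpy as np

# Task names follow the summary above. Treating every task as a binary
# label is an assumption of this sketch, not a detail confirmed by the paper.
TASKS = [
    "length_of_stay", "mortality_48h",
    "fracture", "lung_lesion", "cardiomediastinal_enlargement",
    "consolidation", "pneumonia", "atelectasis", "lung_opacity",
    "pneumothorax", "edema", "pericardial_effusion",
]


def positive_class_weights(labels):
    """Per-task weight for positive examples: (#negatives / #positives).

    This is a common heuristic for imbalanced data (hypothetical choice here).
    `labels` has shape (n_samples, n_tasks) with entries in {0, 1}.
    """
    pos = labels.sum(axis=0)
    neg = labels.shape[0] - pos
    return neg / np.maximum(pos, 1.0)  # guard against tasks with no positives


def multitask_weighted_bce(logits, labels, pos_weight):
    """Class-weighted binary cross-entropy, averaged over samples and tasks."""
    # Numerically stable log(sigmoid(x)) and log(1 - sigmoid(x)).
    log_p = -np.logaddexp(0.0, -logits)
    log_not_p = -np.logaddexp(0.0, logits)
    loss = -(pos_weight * labels * log_p + (1.0 - labels) * log_not_p)
    return loss.mean()


# Usage: 4 patients, 12 tasks, one positive label per task.
labels = np.zeros((4, len(TASKS)))
labels[0, :] = 1.0
logits = np.zeros_like(labels)  # an untrained model that outputs p = 0.5
weights = positive_class_weights(labels)
loss = multitask_weighted_bce(logits, labels, weights)
```

With this weighting, the rare positive cases in each task contribute as much to the loss as the more numerous negatives, which keeps a shared model from simply predicting the majority class for every task.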