Multimodal Clinical Trial Outcome Prediction with Large Language Models

Wenhao Zheng, Dongsheng Peng, Hongxia Xu, Hongtu Zhu, Tianfan Fu, Huaxiu Yao
2024-02-18
Abstract: The clinical trial is a pivotal and costly process, often spanning multiple years and requiring substantial financial resources. Therefore, the development of clinical trial outcome prediction models aims to exclude drugs likely to fail and holds the potential for significant cost savings. Recent data-driven attempts leverage deep learning methods to integrate multimodal data for predicting clinical trial outcomes. However, these approaches rely on manually designed modal-specific encoders, which limits both the extensibility to adapt to new modalities and the ability to discern similar information patterns across different modalities. To address these issues, we propose a multimodal mixture-of-experts (LIFTED) approach for clinical trial outcome prediction. Specifically, LIFTED unifies different modality data by transforming them into natural language descriptions. Then, LIFTED constructs unified noise-resilient encoders to extract information from modal-specific language descriptions. Subsequently, a sparse Mixture-of-Experts framework is employed to further refine the representations, enabling LIFTED to identify similar information patterns across different modalities and extract more consistent representations from those patterns using the same expert model. Finally, a mixture-of-experts module is further employed to dynamically integrate different modality representations for prediction, which gives LIFTED the ability to automatically weigh different modalities and pay more attention to critical information. The experiments demonstrate that LIFTED significantly enhances performance in predicting clinical trial outcomes across all three phases compared to the best baseline, showcasing the effectiveness of our proposed key components.
Computation and Language
What problem does this paper attempt to address?
This paper focuses on predicting clinical trial outcomes. Clinical trials are a crucial and costly step in verifying the safety and efficacy of new drugs. Existing methods often rely on deep learning with modality-specific encoders to integrate multimodal data for prediction. However, these methods are hard to extend to new modalities and struggle to identify similar information patterns across different modalities. To address these issues, the paper proposes a new method called LIFTED (muLti-modal mIx-of-experts For ouTcome prEDiction). LIFTED transforms multimodal data into natural language descriptions using a large language model and then extracts information using a unified noise-robust encoder. It then refines these representations using a Sparse Mixture-of-Experts (SMoE) framework, in which similar information patterns are handled by the same expert model, while different patterns are handled by experts with specialized knowledge. Additionally, LIFTED dynamically combines representations from different modalities using a mixture-of-experts module, automatically balancing the importance of different modalities. Experiments show that LIFTED significantly improves performance in predicting clinical trial outcomes in all three phases (I, II, and III) compared to existing state-of-the-art baselines, supporting the effectiveness of the proposed components. The code for LIFTED is publicly available on GitHub. In summary, this paper addresses the challenge of how to integrate and use multimodal data more effectively to improve the accuracy of clinical trial outcome prediction.
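The sparse routing idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sparse_moe`, `make_linear_expert`), the top-k gating rule, the linear experts, and the toy modality embeddings are all assumptions chosen for illustration. It only shows how a shared pool of experts can serve several modalities, with a gate selecting a few experts per input and mixing their outputs.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_linear_expert(matrix):
    """Hypothetical expert: a plain linear map (real experts would be small networks)."""
    return lambda x: [dot(row, x) for row in matrix]


def sparse_moe(x, gate_weights, experts, k=2):
    """Route x to the k highest-scoring experts and mix their outputs.

    Assumption: top-k gating with a softmax over the selected logits only.
    """
    logits = [dot(w, x) for w in gate_weights]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    out = [0.0] * len(experts[top[0]](x))
    for w, i in zip(weights, top):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top, weights


rng = random.Random(0)
dim, n_experts = 4, 3

# One gate and one expert pool shared by every modality.
gate_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
experts = [
    make_linear_expert([[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)])
    for _ in range(n_experts)
]

# Toy stand-ins for encoder outputs of per-modality text descriptions.
modality_embeddings = {
    "drug": [rng.uniform(-1, 1) for _ in range(dim)],
    "disease": [rng.uniform(-1, 1) for _ in range(dim)],
    "criteria": [rng.uniform(-1, 1) for _ in range(dim)],
}

for name, x in modality_embeddings.items():
    out, chosen, weights = sparse_moe(x, gate_weights, experts, k=2)
    print(name, "-> experts", chosen, "weights", [round(w, 3) for w in weights])
```

Because the gate and experts are shared, inputs from different modalities that produce similar gate scores are sent to the same experts, which is the behavior the summary attributes to the SMoE stage. A second, dense gate over the refined modality representations could play the role of the final weighting step, though that part is not sketched here.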