Distilling Large Language Models for Matching Patients to Clinical Trials

Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
2023-12-16
Abstract: The recent success of large language models (LLMs) has paved the way for their adoption in the high-stakes domain of healthcare. Specifically, the application of LLMs in patient-trial matching, which involves assessing patient eligibility against a clinical trial's nuanced inclusion and exclusion criteria, has shown promise. Recent research has shown that GPT-3.5, a widely recognized LLM developed by OpenAI, can outperform existing methods with minimal 'variable engineering' by simply comparing clinical trial information against patient summaries. However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy, and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5 and GPT-4) and open-source LLMs (LLAMA 7B, 13B, and 70B) for the task of patient-trial matching. Employing a multifaceted evaluation framework, we conducted extensive automated and human-centric assessments coupled with a detailed error analysis for each model. To enhance the adaptability of open-source LLMs, we have created a specialized synthetic dataset utilizing GPT-4, enabling effective fine-tuning under constrained data conditions. Our findings reveal that open-source LLMs, when fine-tuned on this limited and synthetic dataset, demonstrate performance parity with their proprietary counterparts. This presents a massive opportunity for their deployment in real-world healthcare applications. To foster further research and applications in this field, we release both the annotated evaluation dataset and the fine-tuned LLM, Trial-LLAMA, for public use.
Artificial Intelligence, Information Retrieval
What problem does this paper attempt to address?
The paper addresses the challenges of using large language models (LLMs) to match patients to clinical trials. Specifically, it focuses on the following aspects:

1. **Cost issue**: Closed-source proprietary LLMs (such as GPT-3.5 and GPT-4) are relatively expensive to use. In healthcare especially, these models usually have to be run in a central cloud environment, which increases the usage cost.
2. **Privacy issue**: Handling patient data in healthcare requires strict protection of personal health information (PHI). Closed-source proprietary LLMs may increase the risk of data leakage because they are usually not run on local infrastructure.
3. **Reproducibility issue**: Because these models are closed and proprietary, results obtained with them are difficult for other researchers or institutions to reproduce, which affects the transparency and credibility of the research.
4. **Model performance**: Although closed-source proprietary LLMs perform well on many tasks, their complexity and opacity limit their wide application in healthcare. It is therefore necessary to develop open-source LLMs that can match or even exceed the performance of closed-source models.

To address these problems, the paper carries out the following work:

- **Systematic evaluation**: The performance of closed-source proprietary LLMs (GPT-3.5 and GPT-4) and open-source LLMs (LLAMA 7B, 13B, and 70B) on the patient-trial matching task was evaluated systematically.
- **Fine-tuning and synthetic data**: A specialized synthetic dataset was generated with GPT-4 and used to fine-tune open-source LLMs, improving their adaptability and performance under limited data conditions.
- **Error analysis**: An error taxonomy was defined, and the error types of each model on the task were analyzed in detail.
- **Public resources**: The annotated dataset used for evaluation and the fine-tuned LLM (called Trial-LLAMA) were released for public use.

Through this work, the paper aims to demonstrate the potential of open-source LLMs on the patient-trial matching task and to provide low-cost, efficient solutions for practical applications in healthcare.
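
To make the task concrete, the following minimal Python sketch shows one way a patient summary could be paired with a trial's inclusion and exclusion criteria in a single prompt for an LLM. It is an illustration only: the `Trial` class, the `build_matching_prompt` function, the prompt wording, and the example trial and patient are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    """Hypothetical container for the trial fields used in matching."""
    trial_id: str
    title: str
    inclusion: list[str] = field(default_factory=list)
    exclusion: list[str] = field(default_factory=list)


def build_matching_prompt(patient_summary: str, trial: Trial) -> str:
    """Assemble one plain-text prompt from a patient summary and a trial.

    The layout and instructions are illustrative assumptions, not the
    prompt used by the authors.
    """
    lines = [
        "Assess whether the patient is eligible for the clinical trial.",
        "",
        f"Trial {trial.trial_id}: {trial.title}",
        "Inclusion criteria:",
    ]
    # Number each criterion so a model's answer can refer to it by index.
    lines += [f"  {i}. {c}" for i, c in enumerate(trial.inclusion, start=1)]
    lines.append("Exclusion criteria:")
    lines += [f"  {i}. {c}" for i, c in enumerate(trial.exclusion, start=1)]
    lines += [
        "",
        "Patient summary:",
        patient_summary.strip(),
        "",
        "For each criterion, state whether it is met, not met, "
        "or has insufficient information.",
    ]
    return "\n".join(lines)


# Made-up example data, for illustration only.
example_trial = Trial(
    trial_id="EXAMPLE-001",
    title="Hypothetical study of a new antihypertensive drug",
    inclusion=["Age 18 or older", "Diagnosed hypertension"],
    exclusion=["Pregnancy"],
)
prompt = build_matching_prompt(
    "  58-year-old man with long-standing hypertension.  ", example_trial
)
print(prompt)
```

The resulting string could be sent to either a proprietary or a locally hosted open-source model; keeping prompt construction separate from the model call is what makes such a comparison straightforward.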