Aligning Large Language Models for Clinical Tasks

Supun Manathunga, Isuru Hettigoda
DOI: https://doi.org/10.48550/arXiv.2309.02884
2023-09-07
Abstract: Large Language Models (LLMs) have demonstrated remarkable adaptability, showcasing their capacity to excel in tasks for which they were not explicitly trained. However, despite their impressive natural language processing (NLP) capabilities, effective alignment of LLMs remains a crucial challenge when deploying them for specific clinical applications. The ability to generate responses with factually accurate content and to engage in non-trivial reasoning steps is crucial for LLMs to be eligible for applications in clinical medicine. Employing a combination of techniques, including instruction-tuning and in-prompt strategies such as few-shot and chain-of-thought prompting, has significantly enhanced the performance of LLMs. Our proposed alignment strategy for medical question-answering, known as 'expand-guess-refine', offers a parameter- and data-efficient solution. A preliminary analysis of this method demonstrated outstanding performance, achieving a score of 70.63% on a subset of questions sourced from the USMLE dataset.
Computation and Language
What problem does this paper attempt to address?
The paper addresses how to align large language models (LLMs) effectively for clinical tasks. Although LLMs have demonstrated strong natural language processing capabilities, they still struggle in specific clinical applications, particularly with generating factually accurate content and performing non-trivial reasoning. The paper proposes a strategy called "expand-guess-refine" to improve LLM performance on medical question-answering tasks. A preliminary analysis shows that this method achieved an accuracy of 70.63% on a subset of the USMLE dataset. Specifically, the paper focuses on the following aspects:

1. **Factual Accuracy**: Ensuring that the content generated by LLMs is factually accurate.
2. **Reasoning Ability**: Enabling LLMs to perform complex multi-step reasoning.
3. **Data and Parameter Efficiency**: Reducing reliance on large-scale annotated data through zero-shot learning and contextual prompting strategies.
4. **Interpretability**: Enhancing the interpretability and verifiability of model outputs by introducing a non-parametric knowledge base.

Achieving these goals is crucial for applying LLMs in sensitive medical fields.
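
The summary above names the three stages of "expand-guess-refine" but does not describe what each stage does. The following Python sketch is therefore one possible reading, not the authors' implementation. It assumes the stages are chained zero-shot prompts and that the refine stage consults the non-parametric knowledge base mentioned above. The names `expand_guess_refine`, `llm`, and `retrieve`, and all prompt wording, are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical interfaces: any text-in/text-out model and any passage retriever.
LLM = Callable[[str], str]
Retriever = Callable[[str], Sequence[str]]


def expand_guess_refine(question: str, options: Sequence[str],
                        llm: LLM, retrieve: Retriever) -> str:
    """Three chained prompts; the content of each stage is an assumption."""
    # Label the answer options A, B, C, ...
    choices = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options))

    # Expand: ask the model to elaborate on the question (zero-shot).
    expansion = llm(
        f"Explain the key clinical findings in this question:\n{question}"
    )

    # Guess: ask for a preliminary answer, conditioned on the expansion.
    guess = llm(
        f"{question}\n{choices}\nContext: {expansion}\nAnswer with one letter."
    )

    # Refine: re-ask with passages retrieved from an external knowledge base.
    passages = "\n".join(retrieve(f"{question} {guess}"))
    final = llm(
        f"{question}\n{choices}\nPreliminary answer: {guess}\n"
        f"Reference passages:\n{passages}\nGive the final answer as one letter."
    )
    return final.strip()
```

Passing the model and the retriever as plain callables keeps the sketch independent of any particular LLM API or knowledge base, so each stage can be swapped out or tested with stubs.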