Dynamic Q&A of Clinical Documents with Large Language Models

Ran Elgedawy, Ioana Danciu, Maria Mahbub, Sudarshan Srinivasan
2024-07-02
Abstract: Electronic health records (EHRs) house crucial patient data in clinical notes. As these notes grow in volume and complexity, manual extraction becomes challenging. This work introduces a natural language interface using large language models (LLMs) for dynamic question-answering on clinical notes. Our chatbot, powered by Langchain and transformer-based LLMs, allows users to query in natural language, receiving relevant answers from clinical notes. Experiments, utilizing various embedding models and advanced LLMs, show Wizard Vicuna's superior accuracy, albeit with high compute demands. Model optimization, including weight quantization, improves latency by approximately 48 times. Promising results indicate potential, yet challenges such as model hallucinations and limited diverse medical case evaluations remain. Addressing these gaps is crucial for unlocking the value in clinical notes and advancing AI-driven clinical decision-making.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses dynamic question answering over clinical documents. As the volume of complex, unstructured clinical notes in electronic health records (EHRs) grows, manually searching them for relevant information becomes impractical for researchers and clinicians. Traditional information retrieval methods offer little help here because they lack semantic and contextual understanding, whereas large language models (LLMs) can comprehend complex natural language text. The paper therefore introduces a natural language conversational interface, built on LLMs, that lets users explore clinical notes through dynamic question answering. The system uses the Langchain framework and a Transformer-based language model to build a chatbot: users ask questions in natural language, and the answers are drawn from the relevant parts of the clinical notes. To optimize retrieval, the paper evaluates how well different semantic embedding models, such as SentenceTransformers, and different LLMs encode queries and documents. In the experiments, the "Wizard Vicuna" model, with 13 billion parameters, achieves the highest accuracy but requires significant computational resources. To reduce inference latency and make deployment easier, the paper applies weight quantization, which cuts latency by a factor of approximately 48. Despite these encouraging results, limitations remain, including model hallucination and the lack of robust evaluation on diverse medical cases. Future work will focus on closing these gaps so that the value of clinical notes can be fully exploited and AI can better support clinical decision-making.
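The retrieve-then-answer flow described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `embed` function is a toy bag-of-words stand-in for a real sentence-embedding model such as SentenceTransformers, the note fragments are invented for illustration, and the names `retrieve` and `build_prompt` are hypothetical. It only shows the general idea of ranking note chunks by similarity to the question and passing the best matches to an LLM as context.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence-embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM (the LLM call is omitted)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical, made-up note fragments for illustration only.
NOTE_CHUNKS = [
    "Patient reports chest pain radiating to the left arm since yesterday.",
    "Medications: metformin 500 mg twice daily for type 2 diabetes.",
    "Follow-up appointment scheduled in two weeks with cardiology.",
]

question = "What medications is the patient taking for diabetes?"
top_chunks = retrieve(question, NOTE_CHUNKS, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real system, a learned embedding model would replace the word-count vectors so that a question and a note can match on meaning rather than on shared words, which is the gap in traditional retrieval that the paper highlights.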