Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions

Guangzhi Xiong, Qiao Jin, Xiao Wang, Minjia Zhang, Zhiyong Lu, Aidong Zhang

2024-10-11

Abstract: The emergent abilities of large language models (LLMs) have demonstrated great potential in solving medical questions. They can possess considerable medical knowledge, but may still hallucinate and are inflexible in the knowledge updates. While Retrieval-Augmented Generation (RAG) has been proposed to enhance the medical question-answering capabilities of LLMs with external knowledge bases, it may still fail in complex cases where multiple rounds of information-seeking are required. To address such an issue, we propose iterative RAG for medicine (i-MedRAG), where LLMs can iteratively ask follow-up queries based on previous information-seeking attempts. In each iteration of i-MedRAG, the follow-up queries will be answered by a conventional RAG system and they will be further used to guide the query generation in the next iteration. Our experiments show the improved performance of various LLMs brought by i-MedRAG compared with conventional RAG on complex questions from clinical vignettes in the United States Medical Licensing Examination (USMLE), as well as various knowledge tests in the Massive Multitask Language Understanding (MMLU) dataset. Notably, our zero-shot i-MedRAG outperforms all existing prompt engineering and fine-tuning methods on GPT-3.5, achieving an accuracy of 69.68% on the MedQA dataset. In addition, we characterize the scaling properties of i-MedRAG with different iterations of follow-up queries and different numbers of queries per iteration. Our case studies show that i-MedRAG can flexibly ask follow-up queries to form reasoning chains, providing an in-depth analysis of medical questions. To the best of our knowledge, this is the first-of-its-kind study on incorporating follow-up queries into medical RAG. The implementation of i-MedRAG is available at https://github.com/Teddy-XiongGZ/MedRAG.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper addresses the difficulties that large language models (LLMs) face in medical question-answering tasks. Although LLMs possess considerable medical knowledge, they may generate inaccurate content ("hallucinations") and are inflexible with respect to knowledge updates. Existing retrieval-augmented generation (RAG) methods enhance LLMs' medical question-answering capabilities through external knowledge bases, but they may still fail in complex cases that require multiple rounds of information-seeking.

To tackle this challenge, the authors propose iterative RAG for medicine (i-MedRAG). i-MedRAG lets the LLM iteratively generate follow-up queries based on its previous information-seeking attempts. In each iteration, the follow-up queries are answered by a conventional RAG system, and the resulting answers guide query generation in the next iteration. The approach targets complex clinical questions, especially those requiring multi-step reasoning.

Specifically, i-MedRAG improves on conventional RAG in the following aspects:

1. **Multi-turn Information Retrieval**: By generating and answering follow-up queries over multiple iterations, i-MedRAG can gradually collect and integrate relevant information for complex questions.
2. **Flexibility**: i-MedRAG generates new queries dynamically based on the information gathered so far, which lets it form reasoning chains across queries.
3. **Performance Improvement**: Experiments show that i-MedRAG improves the performance of various LLMs over conventional RAG on clinical-vignette questions from the USMLE and on knowledge tests in MMLU. Notably, zero-shot i-MedRAG with GPT-3.5 reaches an accuracy of 69.68% on the MedQA dataset, surpassing all existing prompt engineering and fine-tuning methods on GPT-3.5.

In summary, i-MedRAG improves LLM performance on medical question answering by introducing an iterative follow-up query mechanism, particularly for complex clinical questions requiring multi-step reasoning.
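The iterative follow-up mechanism summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, signatures, and default values are hypothetical, and the query generator, the conventional RAG system, and the final answerer are passed in as plain callables so the loop can be run with toy stand-ins. The two parameters `n_iterations` and `n_queries` correspond to the number of iterations and the number of queries per iteration mentioned in the abstract.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration; none of these names come from the MedRAG repository.
History = List[Tuple[str, str]]  # (follow-up query, answer from the RAG system)
QueryGenerator = Callable[[str, History, int], List[str]]
RagAnswerer = Callable[[str], str]
FinalAnswerer = Callable[[str, History], str]


def iterative_rag(
    question: str,
    generate_queries: QueryGenerator,
    answer_with_rag: RagAnswerer,
    answer_final: FinalAnswerer,
    n_iterations: int = 3,  # assumed default, not a value from the paper
    n_queries: int = 2,     # assumed default, not a value from the paper
) -> Tuple[str, History]:
    """Run a fixed number of follow-up rounds, then answer the original question."""
    history: History = []
    for _ in range(n_iterations):
        # The generator sees all earlier query-answer pairs, so each round
        # can build on what previous rounds retrieved.
        queries = generate_queries(question, history, n_queries)
        for query in queries[:n_queries]:
            # Each follow-up query is answered by a conventional RAG system.
            history.append((query, answer_with_rag(query)))
    return answer_final(question, history), history


# Toy stand-ins so the sketch runs without an LLM or a retriever.
def toy_generate_queries(question: str, history: History, n_queries: int) -> List[str]:
    start = len(history) + 1
    return [f"follow-up {start + i} for: {question}" for i in range(n_queries)]


def toy_answer_with_rag(query: str) -> str:
    return f"retrieved answer to '{query}'"


def toy_answer_final(question: str, history: History) -> str:
    return f"final answer using {len(history)} query-answer pairs"


answer, history = iterative_rag(
    "example clinical vignette",
    toy_generate_queries,
    toy_answer_with_rag,
    toy_answer_final,
)
```

In a real system, the three callables would wrap LLM prompts and a retriever. Keeping them as parameters makes it easy to vary `n_iterations` and `n_queries` independently when studying how accuracy scales with each.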

Benchmarking Retrieval-Augmented Generation for Medicine

Rationale-Guided Retrieval Augmented Generation for Medical Question Answering

MKRAG: Medical Knowledge Retrieval Augmented Generation for Medical Question Answering

Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models

Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering

Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation

Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models

Enhancing Large Language Models with Domain-specific Retrieval Augment Generation: A Case Study on Long-form Consumer Health Question Answering in Ophthalmology

JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability

MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems

GastroBot: a Chinese gastrointestinal disease chatbot based on the retrieval-augmented generation

Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation

Development and Testing of Retrieval Augmented Generation in Large Language Models -- A Case Study Report

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting