Two-phase Framework Clinical Question-Answering; A case-study of Autocorrection for Guideline-concordance

Amara Tariq, Nathan Yu, Bhavik Patel, Imon Banerjee
DOI: https://doi.org/10.1101/2024.11.04.24316718
2024-11-05
Abstract: The use of large language models (LLMs) for generative tasks in critical domains like medicine is fraught with challenges such as hallucination. In medicine, hallucination may take a unique shape: the LLM-generated language is not inaccurate, but the suggested treatment or medication has since been discontinued in a specific context. Reinforcement learning-based solutions for building reliable LLM-based frameworks are limited because the reinforcement typically focuses only on identifying the mistake; correcting the mistake is left to the primary LLM. We propose an innovative solution: a two-phase question-answering framework composed of two LLMs, designed such that one LLM learns to generate answers while the other learns to correct any mistakes in the answers generated by the first. We experimented with the domain of prostate cancer and with LLMs designed for various domains, and showed that domain-specific LLMs outperform generic or wide-domain LLMs.
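The generate-then-correct control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `two_phase_answer`, `TextModel`, `toy_generator`, and `toy_corrector`, as well as the prompt layout, are hypothetical, and the two models are replaced by plain functions that return placeholder strings so the example runs without any LLM.

```python
from typing import Callable

# Hypothetical type alias: any function mapping a prompt string to generated text.
TextModel = Callable[[str], str]


def two_phase_answer(question: str, generator: TextModel, corrector: TextModel) -> dict:
    """Phase 1 drafts an answer; phase 2 rewrites the draft to fix mistakes."""
    draft = generator(question)
    # The prompt layout below is an assumption, not the paper's format.
    correction_prompt = f"Question: {question}\nDraft answer: {draft}\nCorrected answer:"
    final = corrector(correction_prompt)
    return {"draft": draft, "final": final}


# Toy stand-ins for the two models; the strings are placeholders, not clinical content.
def toy_generator(prompt: str) -> str:
    return "draft answer recommending option A"


def toy_corrector(prompt: str) -> str:
    # Extract the draft from the prompt and swap the (made-up) outdated option.
    draft = prompt.split("Draft answer: ", 1)[1].split("\nCorrected answer:", 1)[0]
    return draft.replace("option A", "option B")


result = two_phase_answer("Which option applies?", toy_generator, toy_corrector)
print(result["final"])
```

Keeping both the draft and the corrected answer in the result makes it possible to inspect what the second phase changed, which is one plausible design choice under these assumptions.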