Large Language Model-Driven Evaluation of Medical Records Using MedCheckLLM

Marc Cicero Schubert, Wolfgang Wick, Varun Venkataramani
DOI: https://doi.org/10.1101/2023.11.01.23297684
2023-11-04
MedRxiv
Abstract: Large Language Models (LLMs) offer potential in healthcare, especially in the evaluation of medical documents. This research introduces MedCheckLLM, a multi-step framework designed for the systematic assessment of medical records against established evidence-based guidelines, a process termed 'guideline-in-the-loop'. By keeping the guidelines separate from the LLM's training data, this approach emphasizes validity, flexibility, and interpretability. Suggested evidence-based guidelines are accessed externally and fed back into the LLM for evaluation. The method enables guideline updates and personalized protocols for specific patient groups to be implemented without retraining. We applied MedCheckLLM to expert-validated simulated medical reports, focusing on headache diagnoses following International Headache Society guidelines. MedCheckLLM correctly extracted diagnoses, suggested appropriate guidelines, and accurately evaluated 87% of checklist items, with its evaluations aligning significantly with expert opinions. The system not only enhances healthcare quality assurance but also introduces a transparent and efficient means of applying LLMs in clinical settings. Future work must address privacy and ethical concerns in actual clinical scenarios.
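The 'guideline-in-the-loop' flow described in the abstract (extract a diagnosis, suggest a guideline, fetch that guideline from an external source, then evaluate the record against it item by item) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function `evaluate_record`, the prompt tags, the `GUIDELINES` store, and the keyword-based `stub_llm` are all hypothetical stand-ins chosen for illustration, and the checklist items are placeholders rather than actual International Headache Society criteria.

```python
from typing import Callable, Dict, List

# Hypothetical external guideline store. It lives outside the model, so it can
# be updated or personalized without retraining. The checklist items are
# illustrative placeholders, not real guideline text.
GUIDELINES: Dict[str, List[str]] = {
    "migraine": ["duration of attacks", "location of pain"],
}


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for an LLM call (assumption: keyword rules only)."""
    task, _, body = prompt.partition("\n")
    if task == "EXTRACT_DIAGNOSIS":
        return "migraine" if "migraine" in body.lower() else "unknown"
    if task == "SUGGEST_GUIDELINE":
        return body.strip().lower()
    if task == "CHECK_ITEM":
        item, _, record = body.partition("\n---\n")
        keyword = item.split()[0].lower()
        return "yes" if keyword in record.lower() else "no"
    raise ValueError(f"unknown task: {task}")


def evaluate_record(
    record: str,
    llm: Callable[[str], str],
    guidelines: Dict[str, List[str]],
) -> dict:
    """Run the four steps: extract, suggest, retrieve externally, evaluate."""
    # Step 1: the model extracts the diagnosis from the record.
    diagnosis = llm(f"EXTRACT_DIAGNOSIS\n{record}")
    # Step 2: the model suggests which guideline applies.
    guideline_key = llm(f"SUGGEST_GUIDELINE\n{diagnosis}")
    # Step 3: the guideline is fetched from the external store, not the model.
    checklist = guidelines.get(guideline_key, [])
    # Step 4: the guideline is fed back and each item is checked separately,
    # which keeps the result inspectable item by item.
    results = {
        item: llm(f"CHECK_ITEM\n{item}\n---\n{record}") == "yes"
        for item in checklist
    }
    return {"diagnosis": diagnosis, "guideline": guideline_key, "checklist": results}


if __name__ == "__main__":
    report = "Diagnosis: migraine. Attack duration 4-72 hours. Nausea present."
    print(evaluate_record(report, stub_llm, GUIDELINES))
```

Because the checklist is looked up in step 3 rather than recalled by the model, swapping in an updated or patient-specific guideline only requires changing the store, which is the flexibility the abstract attributes to the approach.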