Building Trustworthy Generative Artificial Intelligence for Diabetes Care and Limb Preservation: A Medical Knowledge Extraction Case

Shayan Mashatian,David G Armstrong,Aaron Ritter,Jeffery Robbins,Shereen Aziz,Ilia Alenabi,Michelle Huo,Taneeka Anand,Kouhyar Tavakolian
DOI: https://doi.org/10.1177/19322968241253568
2024-05-20
Abstract:Background: Large language models (LLMs) offer significant potential in medical information extraction but carry risks of generating incorrect information. This study aims to develop and validate a retriever-augmented generation (RAG) model that provides accurate medical knowledge about diabetes and diabetic foot care to laypersons with an eighth-grade literacy level. Improving health literacy through patient education is paramount to addressing the problem of limb loss in the diabetic population. In addition to affecting patient well-being through improved outcomes, improved physician well-being is an important outcome of a self-management model for patient health education. Methods: We used an RAG architecture and built a question-and-answer artificial intelligence (AI) model to extract knowledge in response to questions pertaining to diabetes and diabetic foot care. We utilized GPT-4 by OpenAI, with Pinecone as a vector database. The NIH National Standards for Diabetes Self-Management Education served as the basis for our knowledge base. The model's outputs were validated through expert review against established guidelines and literature. Fifty-eight keywords were used to select 295 articles and the model was tested against 175 questions across topics. Results: The study demonstrated that with appropriate content volume and few-shot learning prompts, the RAG model achieved 98% accuracy, confirming its capability to offer user-friendly and comprehensible medical information. Conclusion: The RAG model represents a promising tool for delivering reliable medical knowledge to the public which can be used for self-education and self-management for diabetes, highlighting the importance of content validation and innovative prompt engineering in AI applications.
What problem does this paper attempt to address?