Abstract: Advancements in natural language processing have revolutionized the way we interact with digital information systems such as databases, making them more accessible. However, challenges persist, especially when accuracy is critical, as in the biomedical domain. A key issue is the hallucination problem, where models generate information unsupported by the underlying data, potentially leading to dangerous misinformation. This paper presents a novel approach that addresses this problem by combining Large Language Models (LLMs) and Knowledge Graphs (KGs) to improve the accuracy and reliability of question-answering systems, using a biomedical KG as an example. Built on the LangChain framework, our method incorporates a query checker that ensures the syntactic and semantic validity of LLM-generated queries, which are then used to extract information from the KG, substantially reducing errors such as hallucinations. We evaluated the overall performance on a new benchmark dataset of 50 biomedical questions, testing several LLMs, including GPT-4 Turbo and llama3:70b. Our results indicate that while GPT-4 Turbo outperforms other models in generating accurate queries, open-source models like llama3:70b show promise with appropriate prompt engineering. To make this approach accessible, we developed a user-friendly web-based interface that allows users to input natural language queries, view the generated and corrected Cypher queries, and verify the resulting paths for accuracy. Overall, this hybrid approach effectively addresses common issues such as data gaps and hallucinations, offering a reliable and intuitive solution for question-answering systems. The source code for generating the results of this paper and for the user interface can be found in our Git repository: https://git.zib.de/lpusch/cyphergenkg-gui
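
The idea of checking an LLM-generated Cypher query before it is run against the KG can be illustrated with a minimal sketch. This is not the paper's query checker: the toy schema, the regular expressions, and the function name `check_query` are all assumptions made for illustration, and a real checker would presumably also need to validate relationship directions and properties, and to correct queries rather than only flag them.

```python
import re

# Hypothetical toy schema; the actual biomedical KG schema is not given in the abstract.
SCHEMA = {
    "labels": {"Drug", "Disease", "Gene"},
    "relationships": {"TREATS", "ASSOCIATED_WITH"},
}

# Node labels appear as (var:Label); relationship types as [var:TYPE].
LABEL_RE = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
REL_RE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def check_query(query, schema=SCHEMA):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []

    # Rough syntactic checks (illustrative only, not a Cypher parser).
    if query.count("(") != query.count(")"):
        problems.append("unbalanced parentheses")
    if query.count("[") != query.count("]"):
        problems.append("unbalanced brackets")
    if not re.search(r"\bRETURN\b", query, re.IGNORECASE):
        problems.append("missing RETURN clause")

    # Rough semantic checks: every label and relationship type must exist in the schema.
    for label in LABEL_RE.findall(query):
        if label not in schema["labels"]:
            problems.append(f"unknown node label: {label}")
    for rel in REL_RE.findall(query):
        if rel not in schema["relationships"]:
            problems.append(f"unknown relationship type: {rel}")

    return problems


good = "MATCH (d:Drug)-[:TREATS]->(x:Disease) RETURN d.name"
bad = "MATCH (d:Drug)-[:CURES]->(x:Illness) RETURN d.name"
good_problems = check_query(good)
bad_problems = check_query(bad)
```

Under these assumptions, a query that references only known labels and relationship types passes, while one that invents a label or relationship type is flagged before it reaches the database, which is where a hallucinated schema element would otherwise produce an empty or misleading answer.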

Augmented and Programmatically Optimized LLM Prompts Reduce Chemical Hallucinations

Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models

Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning

Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics

DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises

ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation

Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering

Minimizing Factual Inconsistency and Hallucination in Large Language Models

Banishing LLM Hallucinations Requires Rethinking Generalization

Teaching Language Models to Hallucinate Less with Synthetic Tasks

Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models

Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques

Supervisory Prompt Training

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

Towards Mitigating Hallucination in Large Language Models via Self-Reflection

Large Language Model Prompting Techniques for Advancement in Clinical Medicine

Mitigating Hallucinations in Large Language Models: A Comparative Study of RAG-enhanced vs. Human-Generated Medical Templates

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

Feedback-aligned Mixed LLMs for Machine Language-Molecule Translation

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models