Abstract: Large language models (LLMs), such as GPT3.5, GPT4 and LLAMA2, perform surprisingly well and outperform human experts on many tasks. However, in many domain-specific evaluations, these LLMs often hallucinate because they were insufficiently trained on relevant corpora. Furthermore, fine-tuning large models can be impractical because the LLMs are not open source or because high-quality domain instructions are difficult to construct. Structured knowledge bases such as knowledge graphs (KGs) can therefore better provide domain background knowledge for LLMs while making full use of their reasoning and analysis capabilities. In some previous works, the LLM is called multiple times during question-driven subgraph retrieval to determine whether the current triplet is suitable for inclusion in the subgraph. For questions that require a multi-hop reasoning path in particular, frequent LLM calls consume a lot of computing power. Moreover, when choosing the reasoning path, the LLM is called once for each step, and if one step is selected incorrectly, the error accumulates in the following steps. In this paper, we integrate and optimize a pipeline for selecting reasoning paths from a KG based on an LLM, which reduces the dependency on the LLM. In addition, we propose a simple and effective subgraph retrieval method based on chain of thought (CoT) and PageRank, which returns the paths most likely to contain the answer. We conduct experiments on three datasets: GenMedGPT-5k [14], WebQuestions [2], and CMCQA [21]. Finally, RoK demonstrates that the same results as previous SOTA models can be achieved with fewer LLM calls.

What problem does this paper attempt to address?

The paper attempts to address the following issues:

1. **Hallucination Problem**: Large Language Models (LLMs) often generate responses that appear reasonable but are actually incorrect when retrieving knowledge in specific domains, especially for vertical-domain questions. Because domain knowledge in some vertical fields is highly similar, LLMs are prone to confusing questions or answers.
2. **Reasoning Ability for Complex Tasks**: Although LLMs use a large and rich corpus during the pre-training phase and exhibit good memory capabilities in actual tests, they struggle to achieve satisfactory accuracy on tasks requiring complex reasoning. This holds even with Chain of Thought (CoT), because it is uncertain whether LLMs have learned the logical relationships between pieces of knowledge.
3. **Lack of Interpretability**: Because the reasoning process of deep neural networks is not interpretable, LLMs are still considered black-box models.
4. **Efficiency in Selecting Multi-hop Reasoning Paths**: In many previous works, LLMs are invoked repeatedly to determine whether the current triple should be included in the subgraph when retrieving subgraphs from a knowledge graph. This is particularly resource-intensive for problems requiring multi-hop reasoning paths. Additionally, an error in selecting a reasoning path at any step propagates and accumulates in subsequent steps.

To address these issues, the paper proposes a new paradigm, Reasoning on Efficient Knowledge Paths (RoK). RoK optimizes the process of selecting reasoning paths from the knowledge graph, reducing reliance on LLMs, and introduces a simple and effective subgraph retrieval method based on Chain of Thought and PageRank, which returns the paths most likely to contain the answer. Experimental results show that RoK matches the results of existing state-of-the-art models while reducing the number of LLM invocations.
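The summary does not spell out how RoK combines Chain of Thought with PageRank, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that a CoT step has already produced a set of seed entities (hard-coded here), runs personalized PageRank from those seeds over a toy knowledge graph, and ranks candidate paths by the mean score of the entities they visit. The function names (`personalized_pagerank`, `enumerate_paths`, `rank_paths`), the toy triples, and the mean-score ranking rule are all illustrative assumptions.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iters=50):
    """Power-iteration personalized PageRank on an undirected view of the KG."""
    neighbors = defaultdict(set)
    for head, _rel, tail in edges:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    nodes = sorted(neighbors)
    seeds = [s for s in seeds if s in neighbors]  # ignore seeds missing from the KG
    # Restart probability is spread uniformly over the seed entities only.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            share = damping * score[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        score = nxt
    return score


def enumerate_paths(edges, start, max_hops):
    """All simple directed paths of 1..max_hops triples starting at `start`."""
    outgoing = defaultdict(list)
    for head, rel, tail in edges:
        outgoing[head].append((rel, tail))
    paths = []

    def dfs(node, path, visited):
        if path:
            paths.append(list(path))
        if len(path) == max_hops:
            return
        for rel, tail in outgoing[node]:
            if tail not in visited:
                path.append((node, rel, tail))
                dfs(tail, path, visited | {tail})
                path.pop()

    dfs(start, [], {start})
    return paths


def rank_paths(paths, score):
    """Sort paths by the mean PageRank score of the entities they visit."""
    def path_score(path):
        entities = [path[0][0]] + [tail for _h, _r, tail in path]
        return sum(score[e] for e in entities) / len(entities)

    return sorted(paths, key=path_score, reverse=True)


# Toy medical KG (hypothetical triples, for illustration only).
edges = [
    ("fever", "symptom_of", "influenza"),
    ("cough", "symptom_of", "influenza"),
    ("fever", "symptom_of", "malaria"),
    ("influenza", "treated_by", "oseltamivir"),
    ("malaria", "treated_by", "artemisinin"),
]
# Assumed output of a CoT entity-extraction step for a question
# such as "What can treat a patient with fever and cough?"
seeds = ["fever", "cough"]

score = personalized_pagerank(edges, seeds)
candidates = [p for s in seeds for p in enumerate_paths(edges, s, max_hops=2)]
ranked = rank_paths(candidates, score)
for path in ranked[:3]:
    print(" -> ".join([path[0][0]] + [t for _h, _r, t in path]))
```

In this sketch the graph is scored once and no LLM call is made per hop, which is the kind of saving the summary attributes to RoK. A real system would still need an LLM (or another extractor) to produce the seed entities and to generate the final answer from the top-ranked paths.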

Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering

Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning

Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph

Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph

Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question Answering

Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs

KnowledgeNavigator: Leveraging Large Language Models for Enhanced Reasoning over Knowledge Graph

Clue-Guided Path Exploration: Optimizing Knowledge Graph Retrieval with Large Language Models to Address the Information Black Box Challenge

Debate on Graph: a Flexible and Reliable Reasoning Framework for Large Language Models

Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs

Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning

ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs

Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs

KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models

Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation

Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models

Knowledge Graph-Enhanced Large Language Models via Path Selection

Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs