keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM

Chaojie Wang, Yishi Xu, Zhong Peng, Chenxi Zhang, Bo Chen, Xinrun Wang, Lei Feng, Bo An
2023-12-31
Abstract: Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially question answering. However, when faced with problems beyond the scope of their knowledge, these LLMs tend to talk nonsense with a straight face; a potential solution is to incorporate an Information Retrieval (IR) module and generate responses based on the retrieved knowledge. In this paper, we present a novel framework that helps LLMs, such as ChatGPT, retrieve question-related structured information from a knowledge graph, and demonstrate that Knowledge-based question answering (Keqing) could be a natural Chain-of-Thought (CoT) mentor that guides the LLM to sequentially find the answer entities of a complex question through interpretable logical chains. Specifically, the workflow of Keqing consists of decomposing a complex question according to predefined templates, retrieving candidate entities from the knowledge graph, reasoning out the answers to sub-questions, and finally generating a response with reasoning paths, which greatly improves the reliability of the LLM's responses. Experimental results on KBQA datasets show that Keqing achieves competitive performance and can illustrate the logic of answering each question.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses a weakness of large language models (LLMs) in knowledge-based question answering and proposes a new solution. Although existing LLMs perform well on many natural language processing (NLP) tasks, they often generate incorrect, meaningless, or false information when faced with questions beyond the scope of their knowledge. This phenomenon is known as "hallucination." To mitigate it, the paper introduces a framework called Keqing, whose workflow has four stages:

1. **Question Decomposition**: Decomposing a complex question into multiple sub-questions according to predefined templates, making it easier for the LLM to understand and process each part.
2. **Knowledge Retrieval**: Collecting entity information related to each sub-question by retrieving it from a knowledge graph. The paper presents this as more precise and interpretable than traditional embedding-based retrieval.
3. **Candidate Reasoning**: Selecting the correct answers to each sub-question from the retrieved candidate entities. This step relies on the reasoning capabilities of the LLM.
4. **Response Generation**: Generating the final response from the multi-turn question-answer record, which improves the reliability of the LLM's responses.

Through these steps, Keqing aims to improve both the performance of LLMs on knowledge-intensive tasks and the interpretability and accuracy of their responses. Experimental results show that Keqing performs competitively on knowledge graph question-answering datasets and can show the logic behind the answer to each question.
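To make the four stages concrete, here is a minimal sketch of the decompose, retrieve, reason, and respond loop over a toy knowledge graph. It is not the authors' implementation. The triples, the relation names, the `TEMPLATES` table, and every function name are assumptions made for illustration. The stages that Keqing delegates to an LLM (matching a question to a template and choosing among candidates) are replaced here by a dictionary lookup and a simple relation filter.

```python
from dataclasses import dataclass

# Toy knowledge graph as (head, relation, tail) triples; illustrative only.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Interstellar", "directed_by", "Christopher Nolan"),
    ("Inception", "starred_actors", "Leonardo DiCaprio"),
    ("Interstellar", "starred_actors", "Matthew McConaughey"),
]


@dataclass(frozen=True)
class SubQuestion:
    text: str              # sub-question template with an {entity} slot
    relation: str          # KG relation assumed to answer this sub-question
    inverse: bool = False  # True: follow the edge from tail back to head


# Hypothetical template table: one question type maps to a chain of sub-questions.
TEMPLATES = {
    "director_to_actors": [
        SubQuestion("Who directed {entity}?", "directed_by"),
        SubQuestion("Which movies were directed by {entity}?", "directed_by", inverse=True),
        SubQuestion("Who starred in {entity}?", "starred_actors"),
    ]
}


def decompose(template_key):
    """Stage 1 (stand-in): look up the sub-question chain for a known template."""
    return TEMPLATES[template_key]


def retrieve_candidates(entity):
    """Stage 2: collect every triple that mentions the entity."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]


def reason(sub_q, entity, candidates):
    """Stage 3 (stand-in for the LLM): keep candidates matching relation and direction."""
    answers = []
    for head, rel, tail in candidates:
        if rel != sub_q.relation:
            continue
        if sub_q.inverse and tail == entity:
            answers.append(head)
        elif not sub_q.inverse and head == entity:
            answers.append(tail)
    return answers


def answer_question(seed_entity, template_key):
    """Run the sub-questions in order, feeding each hop's answers into the next."""
    frontier = [seed_entity]
    path = []  # (instantiated sub-question, answers) pairs form the reasoning path
    for sub_q in decompose(template_key):
        next_frontier = set()
        for entity in frontier:
            found = reason(sub_q, entity, retrieve_candidates(entity))
            path.append((sub_q.text.format(entity=entity), sorted(found)))
            next_frontier.update(found)
        frontier = sorted(next_frontier)
    return frontier, path


def generate_response(answers, path):
    """Stage 4: report the final answer together with the reasoning path."""
    lines = [f"Q: {q} -> A: {', '.join(a) or 'none'}" for q, a in path]
    lines.append("Final answer: " + ", ".join(answers))
    return "\n".join(lines)


answers, path = answer_question("Inception", "director_to_actors")
print(generate_response(answers, path))
```

Because each hop is recorded in `path`, the final response carries the chain of sub-questions and intermediate entities alongside the answer. That record is the kind of interpretable reasoning path the summary above describes.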