Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering

Jinheon Baek, Alham Fikri Aji, Amir Saffari
2023-06-07
Abstract: Large Language Models (LLMs) are capable of performing zero-shot closed-book question answering tasks, based on their internal knowledge stored in parameters during pre-training. However, such internalized knowledge might be insufficient and incorrect, which could lead LLMs to generate factually wrong answers. Furthermore, fine-tuning LLMs to update their knowledge is expensive. To this end, we propose to augment the knowledge directly in the input of LLMs. Specifically, we first retrieve the relevant facts to the input question from the knowledge graph based on semantic similarities between the question and its associated facts. After that, we prepend the retrieved facts to the input question in the form of the prompt, which is then forwarded to LLMs to generate the answer. Our framework, Knowledge-Augmented language model PromptING (KAPING), requires no model training, thus completely zero-shot. We validate the performance of our KAPING framework on the knowledge graph question answering task, which aims to answer the user's question based on facts over a knowledge graph, on which ours outperforms relevant zero-shot baselines by up to 48% on average, across multiple LLMs of various sizes.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the tendency of large language models (LLMs) to generate factually incorrect answers in zero-shot question answering. Although LLMs can answer questions zero-shot from the knowledge stored in their parameters, that knowledge may be incomplete, inaccurate, or outdated. Updating it through fine-tuning is costly. The authors therefore propose augmenting the input directly with relevant knowledge to improve zero-shot answer accuracy.

### Main Contributions

1. **Knowledge-Augmented Language Model Prompting Framework**: A new framework (KAPING) is proposed for zero-shot question answering using factual knowledge from knowledge graphs.
2. **Knowledge Retrieval and Augmentation Based on Semantic Similarity**: Entities in the question are matched to entities in the knowledge graph, their triples are filtered by semantic similarity to the question, and the remaining triples are injected into the LLM prompt.
3. **Validation and Performance Improvement**: KAPING was validated on knowledge graph question answering tasks, outperforming relevant zero-shot baselines by up to 48% on average.

### Method Overview

1. **Zero-Shot Question Answering**: Given an input question \( x \), the question answering system returns an answer \( y \). The zero-shot setting uses no labeled data and no model training.
2. **Language Model Prompting**: The question is converted into a prompt string \( x' \) using an instruction template, and the prompt is passed to the LLM to generate an answer.
3. **Knowledge-Augmented Language Model Prompting**:
   - **Knowledge Access**: Extract entities from the question and find the corresponding entities and their associated triples in the knowledge graph.
   - **Knowledge Expression**: Convert the triples into text strings that can be injected into the LLM input.
   - **Knowledge Injection**: Place the verbalized triples in the prompt together with the question, and pass the combined prompt to the LLM to generate an answer.
4. **Relevant Knowledge Retrieval**: To reduce the impact of irrelevant triples, only the most relevant triples are kept, ranked by the similarity between the embedding of the question and the embedding of each triple.

### Experimental Setup

1. **Datasets**: The WebQuestionsSP and Mintaka knowledge graph question answering datasets were used.
2. **Models**: LLMs of various sizes were evaluated, including T5, T0, OPT, and GPT-3.
3. **Baseline Methods**:
   - **No Knowledge**: Naive LM prompting without knowledge augmentation.
   - **Random Knowledge**: Augmenting with randomly selected triples associated with the question entity.
   - **Popular Knowledge**: Augmenting with triples whose relations occur most frequently in the KG.
   - **Generated Knowledge**: Augmenting with knowledge generated by prompting the LLM itself.
4. **Evaluation Metrics**:
   - **Generation Accuracy**: Measures whether the generated tokens contain the answer entity.
   - **Retrieval Performance**: Uses Mean Reciprocal Rank (MRR) and Top-K accuracy to evaluate how well the retrieved triples support answer generation.

### Experimental Results and Analysis

1. **Main Results**: KAPING significantly outperformed all baseline methods on zero-shot knowledge graph question answering, especially with smaller LLMs.
2. **Importance of Knowledge**: The results indicate that the internal knowledge of LLMs is insufficient to generate accurate answers, and that augmenting with relevant factual knowledge is necessary.
3. **Performance Improvement**: In resource-limited settings (such as production environments), augmenting knowledge is more effective than increasing model size.
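
The evaluation metrics listed under Experimental Setup can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the summary: answers are matched by case-insensitive string containment, and a retrieved triple counts as relevant when its subject or object equals a gold answer. The function names and the toy data are hypothetical, and the paper's actual evaluation code may differ.

```python
def generation_accuracy(generations, answer_sets):
    """Fraction of generated texts that contain at least one gold answer string."""
    hits = 0
    for text, answers in zip(generations, answer_sets):
        lowered = text.lower()
        if any(ans.lower() in lowered for ans in answers):
            hits += 1
    return hits / len(generations)


def first_relevant_rank(ranked_triples, answers):
    """1-based rank of the first triple whose subject or object is a gold answer.

    Assumption: this is how a retrieved triple is judged relevant. Returns None
    when no triple in the ranking matches.
    """
    gold = {a.lower() for a in answers}
    for rank, (subject, _relation, obj) in enumerate(ranked_triples, start=1):
        if subject.lower() in gold or obj.lower() in gold:
            return rank
    return None


def mean_reciprocal_rank(ranked_lists, answer_sets):
    """Average of 1/rank of the first relevant triple; 0 when none is retrieved."""
    total = 0.0
    for ranked, answers in zip(ranked_lists, answer_sets):
        rank = first_relevant_rank(ranked, answers)
        if rank is not None:
            total += 1.0 / rank
    return total / len(ranked_lists)


def top_k_accuracy(ranked_lists, answer_sets, k):
    """Fraction of questions with a relevant triple among the top k."""
    hits = 0
    for ranked, answers in zip(ranked_lists, answer_sets):
        rank = first_relevant_rank(ranked, answers)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(ranked_lists)


# Hypothetical toy data: two questions, their generations, and ranked triples.
generations = ["The author is Jane Austen.", "I think it is Paris."]
answer_sets = [["Jane Austen"], ["Lyon"]]
ranked_lists = [
    [("Pride and Prejudice", "genre", "novel"),
     ("Pride and Prejudice", "author", "Jane Austen")],
    [("France", "capital", "Paris"),
     ("France", "currency", "euro")],
]

acc = generation_accuracy(generations, answer_sets)
mrr = mean_reciprocal_rank(ranked_lists, answer_sets)
top1 = top_k_accuracy(ranked_lists, answer_sets, k=1)
top2 = top_k_accuracy(ranked_lists, answer_sets, k=2)
```

In this toy example the first question is answered correctly and its relevant triple sits at rank 2, while the second question has neither a correct generation nor a relevant triple, so MRR and Top-1 accuracy are lower than Top-2 accuracy.
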
### Conclusion

By augmenting the input directly with relevant knowledge, the KAPING framework significantly improves the performance of large language models on zero-shot question answering and reduces factually incorrect answers. The approach is particularly relevant to practical applications that require accurate answers.
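
The four steps described in the Method Overview (knowledge access, knowledge expression, relevant knowledge retrieval, and knowledge injection) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the bag-of-words `embed` function stands in for a learned sentence encoder, entity matching is naive substring search rather than a real entity linker, and the knowledge graph, the `(subject, relation, object)` text format, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def access(question, kg):
    """Knowledge access: collect triples of entities named in the question.

    Assumption: naive substring matching stands in for entity linking.
    """
    lowered = question.lower()
    return [t for entity, triples in kg.items()
            if entity.lower() in lowered for t in triples]


def verbalize(triple):
    """Knowledge expression: turn a triple into a plain text string."""
    subject, relation, obj = triple
    return f"({subject}, {relation}, {obj})"


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, triples, k):
    """Relevant knowledge retrieval: keep the k triples most similar to the question."""
    q = embed(question)
    ranked = sorted(triples, key=lambda t: cosine(q, embed(verbalize(t))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, facts):
    """Knowledge injection: prepend verbalized facts to the question prompt.

    The template wording here is hypothetical.
    """
    lines = ["Facts that may help answer the question:"]
    lines += [verbalize(t) for t in facts]
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical toy knowledge graph keyed by entity name.
kg = {
    "Pride and Prejudice": [
        ("Pride and Prejudice", "author", "Jane Austen"),
        ("Pride and Prejudice", "publication year", "1813"),
        ("Pride and Prejudice", "genre", "novel"),
    ],
    "Moby-Dick": [("Moby-Dick", "author", "Herman Melville")],
}

question = "Who is the author of Pride and Prejudice?"
candidates = access(question, kg)
top_facts = retrieve(question, candidates, k=1)
prompt = build_prompt(question, top_facts)
print(prompt)
```

The resulting `prompt` string is what would be sent to an LLM; no model is trained or fine-tuned at any point, which is what keeps the setup zero-shot. Swapping `embed` for a real sentence encoder changes only the ranking step and leaves the rest of the sketch untouched.
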