UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models

Zhuoyang Li, Liran Deng, Hui Liu, Qiaoqiao Liu, Junzhao Du
2024-06-04
Abstract: OwnThink stands as the most extensive Chinese open-domain knowledge graph introduced in recent times. Despite prior attempts in question answering over OwnThink (OQA), existing studies have faced limitations in model representation capabilities, posing challenges in further enhancing overall accuracy in question answering. In this paper, we introduce UniOQA, a unified framework that integrates two complementary parallel workflows. Unlike conventional approaches, UniOQA harnesses large language models (LLMs) for precise question answering and incorporates a direct-answer-prediction process as a cost-effective complement. Initially, to bolster representation capacity, we fine-tune an LLM to translate questions into the Cypher query language (CQL), tackling issues associated with restricted semantic understanding and hallucinations. Subsequently, we introduce the Entity and Relation Replacement algorithm to ensure the executability of the generated CQL. Concurrently, to augment overall accuracy in question answering, we further adapt the Retrieval-Augmented Generation (RAG) process to the knowledge graph. Ultimately, we optimize answer accuracy through a dynamic decision algorithm. Experimental findings illustrate that UniOQA notably advances SpCQL Logical Accuracy to 21.2% and Execution Accuracy to 54.9%, achieving new state-of-the-art results on this benchmark. Through ablation experiments, we delve into the superior representation capacity of UniOQA and quantify its performance breakthrough.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to address two main issues in Knowledge Graph (KG)-based Question Answering (QA):

1. **Insufficient Model Representation Capability**: Existing methods struggle to accurately understand the semantic information, entities, and relationships in natural language questions, especially when dealing with complex questions (e.g., multi-hop questions). This leads to syntactic and semantic errors in the generated query logic forms, which lowers execution accuracy.
2. **Limitations of a Single Mode**: Traditional methods of converting text to query languages (such as CQL) are prone to errors and difficult to optimize because they rely on fixed templates or rules and lack flexibility.

To solve these problems, the paper proposes a unified framework called UniOQA, which combines two complementary workflows: Translator and Searcher. Specifically:

- **Translator**: By fine-tuning a large language model (LLM), it converts natural language questions into executable CQL queries and introduces an entity and relation replacement algorithm to ensure the generated CQL aligns with the knowledge graph.
- **Searcher**: It employs a direct search strategy to retrieve relevant answers from the knowledge graph, supplementing the results of the Translator.

Finally, a dynamic decision algorithm integrates the results of the two workflows to improve the overall accuracy of question answering. Experimental results show that UniOQA achieves significant performance improvements on the SpCQL benchmark dataset.
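The two-workflow idea described above can be illustrated with a small Python sketch. This is not the paper's implementation: the function names `align_literals` and `choose_answer`, the fuzzy string matching via `difflib`, the similarity cutoff, and the simple "fall back when the query returns nothing" rule are all assumptions chosen for illustration, and the paper's actual replacement and decision algorithms may work quite differently.

```python
import difflib
import re


def align_literals(cql, kg_names, cutoff=0.6):
    """Replace each quoted literal in a generated query with the closest
    name known to the knowledge graph (hypothetical stand-in for an
    entity/relation replacement step)."""

    def fix(match):
        literal = match.group(1)
        if literal in kg_names:
            return match.group(0)  # already a valid name, leave untouched
        close = difflib.get_close_matches(literal, kg_names, n=1, cutoff=cutoff)
        # Keep the original literal when nothing is similar enough.
        return "'" + close[0] + "'" if close else match.group(0)

    return re.sub(r"'([^']*)'", fix, cql)


def choose_answer(translator_answers, searcher_answers):
    """Toy decision rule (an assumption, not the paper's algorithm):
    trust the executed query when it returns results, otherwise fall
    back to the answers retrieved by the search workflow."""
    return translator_answers if translator_answers else searcher_answers


if __name__ == "__main__":
    # Hypothetical names present in the graph.
    kg_names = ["Yao Ming", "Ye Li"]
    # Hypothetical model output with a misspelled entity name.
    generated = "MATCH (n {name: 'Yao Min'})-[:spouse]->(m) RETURN m.name"
    print(align_literals(generated, kg_names))
    print(choose_answer([], ["Ye Li"]))
```

In this sketch the replacement step only repairs string literals so that the query can match nodes that exist in the graph, and the decision step treats the Searcher purely as a fallback. A real system would presumably score both candidate answer sets rather than apply a fixed preference.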