LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities

Yuqi Zhu, Xiaohan Wang, Jing Chen, Shuofei Qiao, Yixin Ou, Yunzhi Yao, Shumin Deng, Huajun Chen, Ningyu Zhang
2024-08-18
Abstract: This paper presents an exhaustive quantitative and qualitative evaluation of Large Language Models (LLMs) for Knowledge Graph (KG) construction and reasoning. We engage in experiments across eight diverse datasets, focusing on four representative tasks encompassing entity and relation extraction, event extraction, link prediction, and question-answering, thereby thoroughly exploring LLMs' performance in the domain of construction and inference. Empirically, our findings suggest that LLMs, represented by GPT-4, are more suited as inference assistants rather than few-shot information extractors. Specifically, while GPT-4 exhibits good performance in tasks related to KG construction, it excels further in reasoning tasks, surpassing fine-tuned models in certain cases. Moreover, our investigation extends to the potential generalization ability of LLMs for information extraction, leading to the proposition of a Virtual Knowledge Extraction task and the development of the corresponding VINE dataset. Based on these empirical findings, we further propose AutoKG, a multi-agent-based approach employing LLMs and external sources for KG construction and reasoning. We anticipate that this research can provide invaluable insights for future undertakings in the field of knowledge graphs. The code and datasets are available at https://github.com/zjunlp/AutoKG.
Computation and Language, Artificial Intelligence, Databases, Information Retrieval, Machine Learning
What problem does this paper attempt to address?
This paper evaluates the performance and potential of large language models (LLMs) in knowledge graph (KG) construction and reasoning. Through a series of experiments, the researchers aim to answer the following key questions:

1. **Basic capabilities of LLMs in KG tasks**:
   - Evaluate the performance of LLMs in zero-shot and one-shot settings on entity and relation extraction, event extraction, link prediction, and question answering.
   - Compare different LLMs (such as GPT-4, ChatGPT, and text-davinci-003) with one another and with fully supervised small models.
2. **Generalization ability of LLMs**:
   - Design a virtual knowledge extraction task to evaluate how LLMs perform when facing unseen knowledge.
   - Construct a new dataset, VINE, to test whether LLMs can acquire new knowledge from instructions and then perform extraction effectively.
3. **Future opportunities**:
   - Explore how multi-agent systems and external resources can be used to automate KG construction and reasoning (AutoKG).
   - Propose a multi-agent framework in which multiple LLM agents collaborate through iterative dialogue to improve KG construction and reasoning.

Through these experiments and analyses, the researchers hope to provide valuable insights and promote future research and development in the field of knowledge graphs.
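
To make the zero-shot and one-shot settings mentioned above more concrete, the following is a minimal sketch of how the two prompts might differ for relation extraction. It is not the prompt format used in the paper: the function name `build_re_prompt`, the instruction wording, the relation label `born_in`, and the example sentences are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# A (head, relation, tail) triple; hypothetical type used only in this sketch.
Triple = Tuple[str, str, str]


def build_re_prompt(
    text: str,
    relations: List[str],
    demo: Optional[Tuple[str, Triple]] = None,
) -> str:
    """Build a relation-extraction prompt.

    With `demo=None` the prompt is zero-shot; passing one
    (sentence, triple) demonstration makes it one-shot.
    The wording is an assumption, not the paper's actual prompt.
    """
    lines = [
        "Extract (head, relation, tail) triples from the text.",
        "Allowed relations: " + ", ".join(relations),
    ]
    if demo is not None:
        demo_text, (head, rel, tail) = demo
        # The single demonstration is the only difference between the settings.
        lines.append(f"Example text: {demo_text}")
        lines.append(f"Example output: ({head}, {rel}, {tail})")
    lines.append(f"Text: {text}")
    lines.append("Output:")
    return "\n".join(lines)


zero_shot = build_re_prompt("Marie Curie was born in Warsaw.", ["born_in"])
one_shot = build_re_prompt(
    "Marie Curie was born in Warsaw.",
    ["born_in"],
    demo=("Alan Turing was born in London.", ("Alan Turing", "born_in", "London")),
)
```

The resulting string would then be sent to whichever LLM is being evaluated. Keeping the demonstration as the only varying part makes it straightforward to compare the two settings on the same input.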