Give Us the Facts: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling

Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
2024-01-30
Abstract: Recently, ChatGPT, a representative large language model (LLM), has gained considerable attention due to its powerful emergent abilities. Some researchers suggest that LLMs could potentially replace structured knowledge bases like knowledge graphs (KGs) and function as parameterized knowledge bases. However, while LLMs are proficient at learning probabilistic language patterns from large corpora and engaging in conversations with humans, they, like previous smaller pre-trained language models (PLMs), still have difficulty recalling facts while generating knowledge-grounded content. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs to incorporate explicit factual knowledge into PLMs, thus improving their ability to generate text that requires factual knowledge and to provide more informed responses to user queries. This paper reviews the studies on enhancing PLMs with KGs, detailing existing knowledge graph enhanced pre-trained language models (KGPLMs) as well as their applications. Inspired by existing studies on KGPLMs, this paper proposes to enhance LLMs with KGs by developing knowledge graph-enhanced large language models (KGLLMs). KGLLMs provide a solution to enhance LLMs' factual reasoning ability, opening up new avenues for LLM research.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper examines why large language models (LLMs) struggle to generate factually correct text and proposes enhancing them with knowledge graphs (KGs) to improve their factual reasoning. Specifically:

1. **Research Background**: With advances in big data and high-speed computing, pre-trained language models (PLMs) such as BERT and GPT have been widely applied. Recent studies show that once the number of model parameters reaches a certain scale, LLMs exhibit surprising emergent abilities. However, these models still have difficulty generating fact-based content.
2. **Existing Problems**: Although LLMs can store a large amount of information, they struggle to recall and apply it reliably during generation, especially when precise factual support is needed. As a result, LLM-generated text can be inaccurate or unsubstantiated.
3. **Solution**: To overcome this limitation, researchers have proposed combining KGs with LLMs. As structured knowledge bases, KGs provide explicit factual information that helps LLMs better understand and use relevant knowledge. Developing knowledge graph-enhanced large language models (KGLLMs) can therefore improve LLM performance on content that requires factual support.
4. **Contribution**: The paper systematically reviews existing research on knowledge graph enhanced pre-trained language models (KGPLMs) and proposes extending these approaches to LLMs, using KGs to improve their factual reasoning. It also discusses the complementary relationship between LLMs and KGs and directions for future work.
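To make the idea of supplying explicit KG facts to a language model concrete, here is a minimal sketch in Python. It is not the paper's method: the `Triple` type, the `TOY_KG` contents, and the `retrieve_facts` and `build_prompt` helpers are all hypothetical names introduced for illustration, and the naive string match stands in for whatever entity linking and retrieval a real system would use. The sketch only shows one plausible way that structured triples could be turned into prompt context.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A single (head, relation, tail) fact, the basic unit of a knowledge graph."""
    head: str
    relation: str
    tail: str


# Toy knowledge graph (illustrative assumption, not data from the paper).
TOY_KG = [
    Triple("Marie Curie", "born_in", "Warsaw"),
    Triple("Marie Curie", "field", "physics"),
    Triple("Warsaw", "capital_of", "Poland"),
]


def retrieve_facts(question: str, kg: list[Triple]) -> list[Triple]:
    """Return triples whose head entity appears in the question (naive string match)."""
    q = question.lower()
    return [t for t in kg if t.head.lower() in q]


def build_prompt(question: str, kg: list[Triple]) -> str:
    """Prepend retrieved facts to the question so a model can ground its answer in them."""
    facts = retrieve_facts(question, kg)
    fact_lines = [
        f"- {t.head} {t.relation.replace('_', ' ')} {t.tail}" for t in facts
    ]
    context = "\n".join(fact_lines) if fact_lines else "- (no matching facts)"
    return f"Known facts:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Where was Marie Curie born?", TOY_KG))
```

The resulting string would be passed to whichever LLM is in use. Injecting facts at the prompt level is only one option; the KGPLM literature the paper reviews also covers approaches that incorporate KG knowledge during training.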