Evaluating the External and Parametric Knowledge Fusion of Large Language Models

Hao Zhang, Yuyang Zhang, Xiaoguang Li, Wenxuan Shi, Haonan Xu, Huanshuo Liu, Yasheng Wang, Lifeng Shang, Qun Liu, Yong Liu, Ruiming Tang
2024-05-29
Abstract: Integrating external knowledge into large language models (LLMs) presents a promising solution to overcome the limitations imposed by their antiquated and static parametric memory. Prior studies, however, have tended to over-rely on external knowledge, underestimating the valuable contributions of an LLM's intrinsic parametric knowledge. The efficacy of LLMs in blending external and parametric knowledge remains largely unexplored, especially in cases where external knowledge is incomplete and must be supplemented by parametric knowledge. We propose to deconstruct knowledge fusion into four distinct scenarios, offering the first thorough investigation of LLM behavior across each. We develop a systematic pipeline for data construction and knowledge infusion to simulate these fusion scenarios, facilitating a series of controlled experiments. Our investigation reveals that enhancing parametric knowledge within LLMs can significantly bolster their capability for knowledge integration. Nonetheless, we identify persistent challenges in memorizing and eliciting parametric knowledge and in determining its boundaries. Our findings aim to steer future explorations on harmonizing external and parametric knowledge within LLMs.
Computation and Language, Artificial Intelligence, Information Retrieval
What problem does this paper attempt to address?
The paper addresses the challenges that large language models (LLMs) face in integrating external knowledge with their parametric knowledge. Specifically:

1. **Effectiveness of Integrating External and Parametric Knowledge**: Existing research tends to rely too heavily on external knowledge and to neglect the value of an LLM's internal parametric knowledge. The paper therefore examines how well LLMs integrate the two types of knowledge under different conditions, especially when external knowledge is incomplete or irrelevant.
2. **Definition and Evaluation of Knowledge Integration Scenarios**: The paper defines four knowledge integration scenarios and systematically constructs datasets that standardize the parametric knowledge held by different LLMs, which allows a fair, model-independent evaluation.
3. **Challenges in Memorizing and Eliciting Parametric Knowledge**: Although enhancing parametric knowledge can significantly improve an LLM's ability to integrate knowledge, the paper also reveals persistent challenges in memorizing and eliciting parametric knowledge and in determining its boundaries.

In summary, the paper uses systematic experimental design and evaluation to guide future research on better integrating external and parametric knowledge in LLMs.