Online Continual Knowledge Learning for Language Models

Yuhao Wu, Tongjun Shi, Karthick Sharma, Chun Wei Seah, Shuhao Zhang
2023-11-16
Abstract: Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking. However, this knowledge can become obsolete as global contexts change. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dynamic nature of world knowledge in LMs under real-time constraints. We propose a new benchmark and evaluation metric designed to measure both the rate of new knowledge acquisition and the retention of previously learned knowledge. Our empirical evaluation, conducted using a variety of state-of-the-art methods, establishes robust baselines for OCKL. Our results reveal that existing continual learning approaches are unfortunately insufficient for tackling the unique challenges posed by OCKL. We identify key factors that influence the trade-off between knowledge acquisition and retention, thereby advancing our understanding of how to train LMs in a continually evolving environment.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the poor adaptability of large language models (LLMs) to rapidly changing world knowledge. Existing models such as LLaMA2 and GPT-3 perform well on knowledge-intensive language tasks, but their knowledge is hard to update in real time once training is complete, so they may give outdated or incorrect answers to time-sensitive questions. For example, they can fail on facts that change over time, such as who the current British prime minister is. To address this challenge, the paper proposes the Online Continual Knowledge Learning (OCKL) framework. Compared with traditional Continual Knowledge Learning (CKL), OCKL emphasizes immediate, continuous updates to the model's internal knowledge within very short time windows (from a few days down to a few seconds). This requirement means the model must complete each update in a single pass through the data (epoch = 1) in order to keep up with high-speed, high-volume incoming data streams. To evaluate and optimize model performance under OCKL, the paper introduces two new evaluation metrics: the Knowledge Acquisition Rate (KAR) and the Knowledge Gap (KG). KAR measures how much new knowledge the model acquires per unit of time, while KG quantifies the difference between the model's internal knowledge and external world knowledge using vector representations and a distance measure, and thereby evaluates knowledge retention and forgetting. The paper also examines how well existing continual learning methods apply to OCKL and analyzes experimentally how different model architectures and training methods affect knowledge "forgetting".
The results show that existing continual learning methods fall short on the unique challenges of OCKL, in particular in balancing knowledge acquisition against knowledge retention.
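
To make the two metrics described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's definitions: the summary does not give the exact formulas, so the function names, the choice of "newly correct facts per second" as the rate, and the use of Euclidean distance between two knowledge vectors are all hypothetical.

```python
import math
from typing import Sequence


def knowledge_acquisition_rate(newly_correct: int, elapsed_seconds: float) -> float:
    """Hypothetical KAR: facts newly answered correctly per second of training.

    Assumption: "knowledge acquired" is counted as the number of probe
    questions the model gets right after an update but got wrong before it.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return newly_correct / elapsed_seconds


def knowledge_gap(model_vec: Sequence[float], world_vec: Sequence[float]) -> float:
    """Hypothetical KG: Euclidean distance between two knowledge vectors.

    Assumption: each vector holds one score per probed fact (for example,
    1.0 if the fact is known or true, 0.0 otherwise). The paper's actual
    representation and distance measure may differ.
    """
    if len(model_vec) != len(world_vec):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((m - w) ** 2 for m, w in zip(model_vec, world_vec)))


if __name__ == "__main__":
    # 30 newly correct answers after a 60-second single-pass update.
    print(knowledge_acquisition_rate(30, 60.0))
    # The model is missing exactly one of three probed facts.
    print(knowledge_gap([1.0, 0.0, 1.0], [1.0, 1.0, 1.0]))
```

Under these assumptions, a higher KAR means faster acquisition, while a KG that grows over successive updates would indicate that the model is falling behind the world or forgetting facts it previously held.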