Abstract: Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking. However, this knowledge can become obsolete as global contexts change. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dynamic nature of world knowledge in LMs under real-time constraints. We propose a new benchmark and evaluation metric designed to measure both the rate of new knowledge acquisition and the retention of previously learned knowledge. Our empirical evaluation, conducted using a variety of state-of-the-art methods, establishes robust baselines for OCKL. Our results reveal that existing continual learning approaches are unfortunately insufficient for tackling the unique challenges posed by OCKL. We identify key factors that influence the trade-off between knowledge acquisition and retention, thereby advancing our understanding of how to train LMs in a continually evolving environment.

What problem does this paper attempt to address?

This paper addresses the poor adaptability of large language models (LLMs) to rapidly changing world knowledge. Models such as LLaMA2 and GPT-3 perform well on knowledge-intensive language tasks, but their knowledge is hard to update in real time once training is complete. As a result, they may give outdated or incorrect answers to time-sensitive questions, for example about facts that change over time, such as who the British prime minister is.

To address this challenge, the paper proposes the Online Continual Knowledge Learning (OCKL) framework. Compared with traditional Continual Knowledge Learning (CKL), OCKL places more emphasis on immediate and continuous updates to the model's internal knowledge within a very short time (from a few days down to a few seconds). This requirement means the model must complete each update in a single pass through the data (epoch = 1) in order to cope with high-speed, high-volume incoming data streams.

To evaluate and optimize model performance under the OCKL framework, the paper introduces two new evaluation metrics: the Knowledge Acquisition Rate (KAR) and the Knowledge Gap (KG). KAR measures how much new knowledge the model acquires per unit time. KG quantifies the difference between the model's internal knowledge and external world knowledge using vector representations and a distance measure, and is used to evaluate knowledge retention and forgetting. The paper also examines how well existing continual learning methods apply to OCKL and analyzes experimentally how different model architectures and training methods affect knowledge "forgetting".

The results show that existing continual learning methods fall short on the unique challenges of OCKL, especially in balancing knowledge acquisition against knowledge retention.
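To make the two metrics more concrete, here is a minimal Python sketch. It is not the paper's definition: the summary above gives no formulas, so the sketch assumes that KAR can be approximated as the gain in accuracy on new-knowledge probes divided by elapsed training time, and that KG can be approximated as the Euclidean distance between a vector representing the model's knowledge and one representing world knowledge. The function names, parameters, and choice of distance are all illustrative assumptions.

```python
import math
from typing import Sequence


def knowledge_acquisition_rate(acc_before: float, acc_after: float,
                               elapsed_seconds: float) -> float:
    """Assumed proxy for KAR: accuracy gained on new-knowledge probes
    per second of online training (not the paper's exact formula)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (acc_after - acc_before) / elapsed_seconds


def knowledge_gap(model_vec: Sequence[float],
                  world_vec: Sequence[float]) -> float:
    """Assumed proxy for KG: Euclidean distance between a vector for the
    model's internal knowledge and a vector for external world knowledge.
    The actual representation and distance used in the paper may differ."""
    if len(model_vec) != len(world_vec):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((m - w) ** 2 for m, w in zip(model_vec, world_vec)))


if __name__ == "__main__":
    # Hypothetical numbers: probe accuracy rises from 0.20 to 0.50
    # after 60 seconds of single-pass (epoch = 1) training.
    print(knowledge_acquisition_rate(0.20, 0.50, 60.0))
    # Hypothetical 2-dimensional knowledge vectors.
    print(knowledge_gap([1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, a higher KAR means faster acquisition, while a KG that grows over the stream would suggest the model is drifting away from current world knowledge, through forgetting or through failing to keep up.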

Online Continual Knowledge Learning for Language Models

Towards Continual Knowledge Learning of Language Models

Continual Learning for Large Language Models: A Survey

Continual Learning of Large Language Models: A Comprehensive Survey

Towards Lifelong Learning of Large Language Models: A Survey

Supervised Knowledge Makes Large Language Models Better In-context Learners

Towards Practical Tool Usage for Continually Learning LLMs

Large Language Models with Controllable Working Memory

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning

Large Language Model Can Continue Evolving From Mistakes

Orthogonal Subspace Learning for Language Model Continual Learning

Interactive Continual Learning: Fast and Slow Thinking

Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models

Online Training of Large Language Models: Learn while chatting

CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models

Scalable Language Model with Generalized Continual Learning

From Static to Dynamic: A Continual Learning Framework for Large Language Models

Recent Advances of Foundation Language Models-based Continual Learning: A Survey

Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models

Continual Post-Training of Language Models