Abstract: Language model continual learning (CL) has recently attracted significant interest for its ability to adapt large language models (LLMs) to dynamic real-world scenarios without retraining. A major challenge in this domain is catastrophic forgetting, where models lose previously acquired knowledge when learning new tasks. Existing approaches commonly use multiple parameter-efficient fine-tuning (PEFT) blocks to acquire task-specific knowledge, yet these methods are inefficient and fail to exploit potential knowledge transfer across tasks. In this paper, we introduce a novel CL framework for language models, named Knowledge Localization and Fusion (KlF), which boosts knowledge transfer without depending on memory replay. KlF first partitions the model into 'skill units' based on parameter dependencies, allowing for more precise control. It then employs a novel group-wise knowledge localization technique to determine the importance distribution of skill units for a new task. By comparing this importance distribution with those from previous tasks, we implement a fine-grained knowledge fusion strategy that retains task-specific knowledge, thereby preventing forgetting, and updates task-shared knowledge, which facilitates bi-directional knowledge transfer. As a result, KlF strikes an effective balance between retaining prior knowledge and excelling in new tasks. KlF also demonstrates strong generalizability, making it suitable for various base models and adaptable to PEFT methods such as LoRA. Furthermore, it is extensible and can be enhanced through integration with memory replay techniques. Comprehensive experiments on two CL benchmarks, involving models ranging from 220M to 7B parameters, affirm the effectiveness of KlF and its variants across different settings.
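
The localize-then-fuse idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a skill unit is one row of a weight matrix, that importance is scored by the mean of |weight × gradient| (a common first-order sensitivity proxy), and that fusion is a simple threshold rule. The names `unit_importance`, `fuse`, and `threshold` are hypothetical.

```python
import numpy as np

def unit_importance(weights, grads):
    """Score each skill unit (assumed here: one row) by mean |w * g|.

    Scores are normalised to sum to 1 so that distributions from
    different tasks can be compared. This scoring rule is an assumption,
    not the group-wise localization method from the paper.
    """
    scores = np.abs(weights * grads).mean(axis=1)
    total = scores.sum()
    return scores / total if total > 0 else scores

def fuse(old_w, new_w, old_imp, new_imp, threshold):
    """Fuse per-unit weights by comparing importance distributions.

    Hypothetical rule:
      - important to previous tasks only -> keep old weights (retain)
      - important to the new task only   -> take new weights
      - important to both                -> average (shared update)
      - important to neither             -> keep old weights
    """
    old_key = old_imp >= threshold
    new_key = new_imp >= threshold
    fused = old_w.copy()
    only_new = new_key & ~old_key
    shared = new_key & old_key
    fused[only_new] = new_w[only_new]
    fused[shared] = 0.5 * (old_w[shared] + new_w[shared])
    return fused
```

In this sketch, units that matter only to earlier tasks are left untouched, which is one simple way to limit forgetting, while units that matter to both tasks are blended so that knowledge can flow in both directions. The averaging step and the single shared threshold are illustrative choices only.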

Unlocking Continual Learning Abilities in Language Models

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

LLaCA: Multimodal Large Language Continual Assistant

Towards Continual Knowledge Learning of Language Models

Large Language Model Can Continue Evolving From Mistakes

Enhancing Visual Continual Learning with Language-Guided Supervision

Cross-model Control: Improving Multiple Large Language Models in One-time Training

KlF: Knowledge Localization and Fusion for Language Model Continual Learning

Don't Stop Learning: Towards Continual Learning for the CLIP Model

Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning

Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning

Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models

SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models

Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models

Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer

An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning

Separable Mixture of Low-Rank Adaptation for Continual Visual Instruction Tuning

Scalable Language Model with Generalized Continual Learning

Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model

MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning

Interactive Continual Learning: Fast and Slow Thinking