Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

DOI: https://doi.org/10.48550/arXiv.2007.07400

2020-07-15

Abstract: A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks. Despite the ubiquity of catastrophic forgetting, there is limited understanding of the underlying process and its causes. In this paper, we address this important knowledge gap, investigating how forgetting affects representations in neural network models. Through representational analysis techniques, we find that deeper layers are disproportionately the source of forgetting. Supporting this, a study of methods to mitigate forgetting illustrates that they act to stabilize deeper layers. These insights enable the development of an analytic argument and empirical picture relating the degree of forgetting to representational similarity between tasks. Consistent with this picture, we observe maximal forgetting occurs for task sequences with intermediate similarity. We perform empirical studies on the standard split CIFAR-10 setup and also introduce a novel CIFAR-100 based task approximating realistic input distribution shift.

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The core problem this paper addresses is how **catastrophic forgetting** manifests in neural networks and what mechanisms drive it. Specifically, the paper focuses on the following aspects:

1. **The influence of catastrophic forgetting on the hidden representations of neural networks**:
   - Using representational similarity measurement, layer-freezing experiments, and layer-resetting experiments, the paper studies how catastrophic forgetting affects the hidden-layer representations of neural networks. It finds that the deeper hidden layers are the main source of catastrophic forgetting: these layers change the most during sequential training, while the lower layers remain relatively stable.
2. **Methods for alleviating catastrophic forgetting and their mechanisms**:
   - The paper analyzes two popular methods for alleviating catastrophic forgetting: replay buffers and Elastic Weight Consolidation (EWC). It finds that both methods alleviate forgetting mainly by stabilizing the representations of the deeper layers.
3. **The influence of task-semantic similarity on catastrophic forgetting**:
   - The paper explores how the semantic similarity between tasks affects the degree of catastrophic forgetting. It finds that forgetting is most severe when the tasks have intermediate similarity. The paper formalizes this observation with an analytic model and supports it with experiments.
4. **Catastrophic forgetting under different task settings**:
   - The paper runs experiments on the standard split CIFAR-10 setup and on a newly introduced CIFAR-100 distribution-shift task to study how catastrophic forgetting manifests under different task settings. In both settings, the deeper hidden layers are the main source of forgetting.
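The representational similarity measurement mentioned above can be illustrated with a small example. The summary does not name the metric, so the sketch below assumes linear centered kernel alignment (CKA), one common choice for comparing layer activations. The function name `linear_cka` and the random "activations" are illustrative placeholders, not taken from the paper.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices of shape (n_examples, n_features)."""
    # Center each feature column so constant offsets do not affect the score.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for one layer's activations on the same 256 inputs.
rng = np.random.default_rng(0)
acts_before = rng.normal(size=(256, 32))

# An orthogonal rotation of the features: linear CKA should stay at 1.
q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
acts_rotated = acts_before @ q

# Independent activations: linear CKA should be much lower.
acts_unrelated = rng.normal(size=(256, 32))

cka_same = linear_cka(acts_before, acts_before)
cka_rotated = linear_cka(acts_before, acts_rotated)
cka_unrelated = linear_cka(acts_before, acts_unrelated)
```

In a forgetting study, `acts_before` and a second matrix would hold a given layer's activations on the first task's inputs before and after training on the second task. A layer whose score stays near 1 has kept its representation, while a low score indicates that the representation changed.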
Overall, the paper characterizes how catastrophic forgetting arises in neural networks: where in the network it occurs, how existing mitigation methods counteract it, and how task similarity governs its severity. These results deepen the understanding of catastrophic forgetting and provide a reference point for future research.
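As a rough illustration of how EWC discourages change in parameters that mattered for an earlier task, the sketch below computes the quadratic EWC penalty in its commonly stated form, (λ/2) Σᵢ Fᵢ (θᵢ − θ*ᵢ)², using a diagonal Fisher information estimate. The function name `ewc_penalty` and all numeric values are assumptions chosen for illustration and do not reflect the paper's experimental configuration.

```python
import numpy as np


def ewc_penalty(theta: np.ndarray, theta_star: np.ndarray,
                fisher_diag: np.ndarray, lam: float) -> float:
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * float(np.sum(fisher_diag * (theta - theta_star) ** 2))


# Parameters after the first task, and a made-up diagonal Fisher estimate:
# the first parameter is treated as important, the second as nearly irrelevant.
theta_star = np.array([1.0, -2.0])
fisher_diag = np.array([10.0, 0.1])

# Two candidate updates of equal size, each moving a different parameter.
move_important = np.array([1.5, -2.0])
move_unimportant = np.array([1.0, -1.5])

penalty_important = ewc_penalty(move_important, theta_star, fisher_diag, lam=1.0)
penalty_unimportant = ewc_penalty(move_unimportant, theta_star, fisher_diag, lam=1.0)
```

During training on the second task this term would be added to the task loss, so updates of the same size cost more when they move parameters with a large Fisher value.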
