Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting

Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong
DOI: https://doi.org/10.48550/arXiv.1904.00310
2019-05-22
Abstract: Addressing catastrophic forgetting is one of the key challenges in continual learning where machine learning systems are trained with sequential or streaming tasks. Despite recent remarkable progress in state-of-the-art deep learning, deep neural networks (DNNs) are still plagued with the catastrophic forgetting problem. This paper presents a conceptually simple yet general and effective framework for handling catastrophic forgetting in continual learning with DNNs. The proposed method consists of two components: a neural structure optimization component and a parameter learning and/or fine-tuning component. By separating the explicit neural structure learning and the parameter estimation, not only is the proposed method capable of evolving neural structures in an intuitively meaningful way, but also shows strong capabilities of alleviating catastrophic forgetting in experiments. Furthermore, the proposed method outperforms all other baselines on the permuted MNIST dataset, the split CIFAR100 dataset and the Visual Domain Decathlon dataset in continual learning setting.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to overcome catastrophic forgetting in continual learning. In the continual learning setting, machine learning systems are trained on sequential or streaming tasks, and current deep neural networks (DNNs) often "forget" previously learned tasks when trained on new ones, a phenomenon known as catastrophic forgetting. The paper proposes a conceptually simple yet general and effective framework for this problem. By separating explicit neural structure learning from parameter estimation, the framework can evolve the neural structure in an intuitively meaningful way and shows strong capability to alleviate catastrophic forgetting in experiments. Specifically, the framework, named "Learn to Grow", contains two main components:

1. **Neural Structure Optimization Component**: Searches for the best network structure for each task in the sequence, considering options such as reusing previously learned parameters, adapting existing layers, and introducing new parameters.
2. **Parameter Learning and/or Fine-tuning Component**: Once the network structure is determined, estimates the model parameters and fine-tunes the old parameters.

With this approach, the paper aims to achieve better performance on each new task while avoiding forgetting of previous tasks, and it reports outperforming the other baseline methods on the permuted MNIST, split CIFAR100, and Visual Domain Decathlon datasets.
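The per-layer choice among reusing, adapting, and adding parameters can be illustrated with a minimal sketch. This is not the paper's search procedure: it assumes each option already comes with a hypothetical validation-loss estimate and simply picks the cheapest option per layer, with an assumed penalty on added parameters. The names `LayerOption`, `select_structure`, `trainable_layers`, and `param_penalty` are illustrative, and treating reused layers as frozen is likewise an assumption made for the example.

```python
from dataclasses import dataclass

# The three per-layer options named in the summary.
CHOICES = ("reuse", "adapt", "new")


@dataclass(frozen=True)
class LayerOption:
    choice: str        # one of CHOICES
    val_loss: float    # hypothetical validation loss if this option is used
    added_params: int  # number of parameters this option would add


def select_structure(layer_options, param_penalty=1e-6):
    """Pick one option per layer by minimizing
    val_loss + param_penalty * added_params (an assumed scoring rule)."""
    structure = []
    for options in layer_options:
        best = min(
            options,
            key=lambda o: o.val_loss + param_penalty * o.added_params,
        )
        structure.append(best.choice)
    return structure


def trainable_layers(structure):
    """Indices of layers that receive task-specific training.
    Assumption: layers marked 'reuse' are kept frozen."""
    return [i for i, choice in enumerate(structure) if choice != "reuse"]


# Hypothetical estimates for a three-layer network on a new task.
options_per_layer = [
    [LayerOption("reuse", 0.30, 0),
     LayerOption("adapt", 0.30, 5_000),
     LayerOption("new", 0.29, 50_000)],
    [LayerOption("reuse", 0.60, 0),
     LayerOption("adapt", 0.40, 5_000),
     LayerOption("new", 0.33, 50_000)],
    [LayerOption("reuse", 0.50, 0),
     LayerOption("adapt", 0.35, 5_000),
     LayerOption("new", 0.34, 50_000)],
]

structure = select_structure(options_per_layer)
print(structure)                    # chosen option for each layer
print(trainable_layers(structure))  # layers to train for the new task
```

Separating `select_structure` from `trainable_layers` mirrors the two-component split described above: the structure is fixed first, and only then is it decided which parameters to train.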