Rehearsal-free Continual Language Learning via Efficient Parameter Isolation

Ling Wang, Tao Ji, Yuanbin Wu, Zhicheng Wang, Wenqiu Zeng, Xiaoling Wang, Yufang Liu, Zhencong Han, Ye Chao, Xu Shao, Congcong Jiang
DOI: https://doi.org/10.18653/v1/2023.acl-long.612
Abstract: We study the problem of defying catastrophic forgetting when learning a series of language processing tasks. Compared with previous methods, we emphasize the importance of not caching data from previous tasks, which makes the problem more challenging. Our proposed method applies the parameter isolation strategy. For each task, it allocates a small portion of private parameters and learns them with a shared pre-trained model. To load the correct parameters at testing time, we introduce a simple yet effective non-parametric method. Experiments on continual language learning benchmarks show that our method is significantly better than all existing no-data-cache methods, and is comparable to (or even better than) those using historical data.
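The parameter isolation idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which non-parametric method is used for selecting parameters at test time, so the nearest-centroid rule below is an assumption made purely for illustration, and `IsolatedTaskStore`, `add_task`, `identify`, `params_for`, and the toy feature vectors are hypothetical names and data, not the authors' implementation.

```python
import math


class IsolatedTaskStore:
    """Toy parameter isolation: one private parameter set per task.

    Assumption: task identity is recovered with a nearest-centroid rule,
    which is one possible non-parametric choice, not necessarily the paper's.
    """

    def __init__(self):
        self.private_params = {}  # task name -> private parameters (opaque here)
        self.centroids = {}       # task name -> mean feature vector for that task

    def add_task(self, name, params, features):
        # Store the task's private parameters; other tasks' entries are untouched.
        self.private_params[name] = params
        dim = len(features[0])
        # Keep only a summary statistic (the mean), not the raw training data.
        self.centroids[name] = [
            sum(f[i] for f in features) / len(features) for i in range(dim)
        ]

    def identify(self, feature):
        # Pick the task whose centroid is closest in Euclidean distance.
        return min(
            self.centroids,
            key=lambda task: math.dist(feature, self.centroids[task]),
        )

    def params_for(self, feature):
        # Load the private parameters of the identified task.
        return self.private_params[self.identify(feature)]


store = IsolatedTaskStore()
store.add_task("sentiment", {"prefix_id": 0}, [[0.9, 0.1], [1.1, -0.1]])
store.add_task("topic", {"prefix_id": 1}, [[-1.0, 2.0], [-0.8, 2.2]])
print(store.identify([1.0, 0.2]))  # prints "sentiment"
```

In this sketch, adding a new task never modifies the parameters stored for earlier tasks, and only a per-task mean vector is retained rather than any training examples, which mirrors the no-data-cache constraint described above.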
Computer Science