Introducing Common Null Space of Gradients for Gradient Projection Methods in Continual Learning

Chengyi Yang,Mingda Dong,Xiaoyue Zhang,Jiayin Qi,Aimin Zhou
DOI: https://doi.org/10.1145/3664647.3680605
2024-01-01
Abstract:Continual learning aims to learn new knowledge from a sequence of tasks without forgetting. Recent studies have found that projecting gradients onto the orthogonal direction of task-specific features is effective. However, these methods mainly focus on mitigating catastrophic forgetting by adopting old features to construct projection spaces, neglecting the potential to enhance plasticity and the valuable information contained in previous gradients. To enhance plasticity and effectively utilize the gradients from old tasks, we propose Gradient Projection in Common Null Space (GPCNS), which projects current gradients into the common null space of final gradients under all preceding tasks. Moreover, to integrate both feature and gradient information, we propose a collaborative framework that allows GPCNS to be utilized in conjunction with existing gradient projection methods as a plug-and-play extension that provides gradient information and better plasticity. Experimental evaluations conducted on three benchmarks demonstrate that GPCNS exhibits superior plasticity compared to conventional gradient projection methods. More importantly, GPCNS can effectively improve the backward transfer and average accuracy for existing gradient projection methods when applied as a plugin, which outperforms all the gradient projection methods without increasing learnable parameters and customized objective functions. The code is available at https://github.com/Hifipsysta/GPCNS.
What problem does this paper attempt to address?