PNSP: Overcoming Catastrophic Forgetting Using Primary Null Space Projection in Continual Learning

DaiLiang Zhou,YongHong Song
DOI: https://doi.org/10.1016/j.patrec.2024.02.009
IF: 4.757
2024-01-01
Pattern Recognition Letters
Abstract:Continual Learning (CL) plays a crucial role in enhancing learning performance for both new and previous tasks in continuous data streams, thus contributing to the advancement of cognitive computing. However, CL faces a fundamental challenge known as the stability-plasticity quandary. In this research, we present an innovative and effective CL algorithm called Primary Null Space Projection (PNSP) to strike a balance between network plasticity and stability. PNSP consists of three main components. Firstly, it leverages the NSP-LRA algorithm to project the gradient of network parameters from previous tasks into a meticulously designed null space. NSP-LRA harnesses high-dimensional geometric information extracted from the feature covariance matrix through low-rank approximation algorithm to obtain the basis of null space dynamically. This process constructs an innovation null space and ensures the continuous updating of orthonormal bases to accommodate changes in the input data. Secondly, we propose a Consistency-guided Task-specific Feature Learning (CTFL) mechanism to tackle the issue of catastrophic forgetting and facilitate continual learning. CTFL achieves this by aligning feature vectors and maintaining consistent feature learning directions, thereby preventing the loss of previously acquired knowledge. Lastly, we introduce Label Guided Self-Distillation (LGSD), a technique that utilizes true labels to guide the distillation process and incorporates a dynamic temperature mechanism to enhance performance. To evaluate the effectiveness of our proposed method, we conduct experiments on the CIFAR100 and TinyImageNet datasets. The results demonstrate significant improvements over state-of-the-art methods. We have made the implementation code of our approach available for reference.
What problem does this paper attempt to address?