ALRS: an Attention Loss Function Based on Row-Sparsity for Incremental Learning.

Yingying Xia, Bo Lu, Jianmin Ji
DOI: https://doi.org/10.1109/icccs57501.2023.10151382
2023-01-01
Abstract: Incremental learning has received significant attention, but catastrophic forgetting remains a major challenge for existing approaches. It prevents models from accumulating knowledge over long periods. To address this problem, we propose a new approach called Attention Loss function based on Row-Sparsity (ALRS), which mines significant patches by simultaneously learning patch weights and logits (class vectors) using the same parameters. We integrate the attention mechanism with this novel loss function to avoid catastrophic forgetting. This enables the model to integrate newly introduced classes with the existing ones without storing any data or models for the base classes of previous steps. To assess its efficacy, we incorporate ALRS into the distillation loss and evaluate the approach on three datasets: CIFAR-100, Caltech-101, and CUBS-200-2011. Compared with LWM, which also does not store data, our method achieves an average absolute improvement of more than 8 percentage points on CIFAR-100. On Caltech-101 and CUBS-200-2011, it achieves accuracy comparable to the baseline.
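The abstract does not give the form of the ALRS loss. As a minimal sketch of what "row-sparsity" usually refers to, the example below assumes the common L2,1 norm (the sum of the Euclidean norms of a matrix's rows) applied to a hypothetical patch-by-class score matrix; the function names, the matrix layout, and the way patch weights are derived from row norms are illustrative assumptions, not the authors' definitions.

```python
import math


def row_norms(matrix):
    """Euclidean norm of each row (one row per image patch, by assumption)."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]


def l21_norm(matrix):
    """L2,1 norm: sum of the row norms. Small when few rows are non-zero."""
    return sum(row_norms(matrix))


def patch_weights(matrix):
    """Hypothetical patch weights: row norms rescaled to sum to 1."""
    norms = row_norms(matrix)
    total = sum(norms)
    if total == 0:
        return [0.0] * len(norms)
    return [n / total for n in norms]


# Two 2-patch, 2-class matrices with the same overall (Frobenius) magnitude.
concentrated = [[3.0, 4.0], [0.0, 0.0]]  # all mass on one patch
spread = [[3.0, 0.0], [0.0, 4.0]]        # mass split across both patches

print(l21_norm(concentrated), l21_norm(spread))
print(patch_weights(concentrated))
```

Under these assumptions, the penalty is lower for the matrix whose mass sits in a single row, which is the sense in which an L2,1-style term favors a few significant patches. The same matrix supplies both the per-patch weights (row norms) and the per-class scores (columns), loosely mirroring the abstract's "same parameters" remark.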