Class Incremental Learning Via Dynamic Regeneration with Task-Adaptive Distillation

Hao Yang, Wei He, Zhenyu Shan, Xiaoxin Fang, Xiong Chen
DOI: https://doi.org/10.1016/j.comcom.2023.12.030
IF: 5.047
2024-01-01
Computer Communications
Abstract: Class Incremental Learning (CIL) is a paradigm that efficiently trains models on an expanding set of classes while preserving performance on classes learned earlier. In the context of smart city development, CIL plays a crucial role in addressing the need for Networking Systems of Artificial Intelligence (NSAI) to continuously adapt to ever-evolving data. The incremental network structure widely adopted in CIL in recent years suffers from poor knowledge distillation and rapid growth in model parameters. In this paper, we introduce a novel strategy named dynamic regeneration with task-adaptive distillation (DRTAD), which dynamically adapts to new tasks and sustains good performance even as the class set continues to grow. DRTAD adopts a two-stage training strategy: dynamic regeneration and dynamic retention. During dynamic regeneration, DRTAD enhances feature representation by creating a new feature extraction module that extracts features from new classes while also utilizing features from previously learned classes. Additionally, DRTAD introduces task-adaptive distillation to address this poor knowledge distillation, further mitigating catastrophic forgetting. During the dynamic retention phase, DRTAD achieves a higher pruning rate through an RM operation. Comprehensive experiments on the CIFAR-100 and ImageNet-100 datasets demonstrate DRTAD’s superior performance compared to existing CIL methods. Notably, in the CIFAR100-B50 5-step incremental setting, DRTAD increases the last-phase accuracy from 65.51% to 70.54% (+5.03%) while using fewer parameters (−11.1%). Similarly, in the ImageNet100-B50 10-step setting, the last-phase accuracy rises from 70.04% to 72.50% (+2.46%). These results indicate DRTAD’s efficacy in mitigating catastrophic forgetting in incremental network structures.
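The abstract does not spell out how task-adaptive distillation is computed, so the following is only a minimal sketch of the generic knowledge-distillation loss that CIL methods commonly build on, not DRTAD's actual formulation. It assumes a frozen old model acting as teacher, a new model whose first logits correspond to the old classes, and a fixed weight `kd_weight` standing in for whatever task-adaptive weighting the paper uses. All function and parameter names here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(teacher || student) over the old classes, scaled by T^2.

    old_logits: outputs of the frozen old model (old classes only).
    new_logits: outputs of the new model (old classes first, then new ones).
    """
    # Only the logits for previously learned classes are distilled.
    student_old = new_logits[:len(old_logits)]
    p = softmax(old_logits, temperature)   # teacher distribution
    q = softmax(student_old, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def total_loss(ce_loss, old_logits, new_logits, kd_weight=1.0):
    """Classification loss plus weighted distillation loss.

    kd_weight is a fixed placeholder (an assumption); a task-adaptive
    scheme would presumably vary it per task.
    """
    return ce_loss + kd_weight * distillation_loss(old_logits, new_logits)
```

When the new model reproduces the old model's outputs on the old classes, the distillation term vanishes and only the classification loss remains. Any drift on those classes is penalized, which is the basic mechanism for limiting catastrophic forgetting.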