Hyper-feature aggregation and relaxed distillation for class incremental learning

Ran Wu, Huanyu Liu, Zongcheng Yue, Jun-Bao Li, Chiu-Wing Sham
DOI: https://doi.org/10.1016/j.patcog.2024.110440
IF: 8
2024-03-28
Pattern Recognition
Abstract: Although neural networks have been used extensively in pattern recognition scenarios, the pre-acquisition of datasets is still challenging. In most pattern recognition areas, preparing a training dataset that covers all data domains is difficult. Incremental learning was proposed to update neural networks in an online manner, but the catastrophic forgetting issue still needs to be studied. Class-incremental learning is one of the most challenging incremental learning contexts; it trains a unified model to classify all incrementally arriving classes learned thus far equally. Prior studies on class-incremental learning favor model stability over plasticity to preserve old knowledge and prevent catastrophic forgetting. Consequently, the model's plasticity is neglected, which makes generalization to new data difficult. We propose a novel distillation-based method named Hyper-feature Aggregation and Relaxed Distillation (HARD) to realize balanced optimization of old and new knowledge. The aggregation of features is proposed to capture the global semantics while maintaining the diversity of the feature distribution after promoting representations of exemplars to higher dimensions. The proposed algorithm also introduces a relaxed restriction that conditions the hyper-feature space through a normalized comparison of the relation matrices. Following generalization on more classes, the model is encouraged to rebuild the feature distribution when meeting new classes and to fine-tune the feature space to realize more distinct interclass boundaries. Extensive experiments were conducted on two benchmark datasets, and consistent improvements under diverse experimental settings demonstrated the effectiveness of the proposed approach.
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic