Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental Learning

Jian Jiao, Yu Dai, Hefei Mei, Heqian Qiu, Chuanyang Gong, Shiyuan Tang, Xinpeng Hao, Hongliang Li
2024-04-01
Abstract: Recent video class-incremental learning usually pursues the accuracy of newly seen classes excessively and relies on memory sets to mitigate catastrophic forgetting of the old classes. However, limited storage allows only a few representative videos to be stored. We therefore propose SNRO, which slightly shifts the features of new classes to remember old classes. Specifically, SNRO contains Examples Sparse (ES) and Early Break (EB). ES decimates videos at a lower sample rate to build memory sets and uses interpolation to align those sparse frames in later tasks. In this way, SNRO stores more examples under the same memory consumption and forces the model to focus on low-semantic features, which are harder to forget. EB terminates training at a small epoch, preventing the model from overstretching into the high-semantic space of the current task. Experiments on the UCF101, HMDB51, and UESTC-MMEA-CL datasets show that SNRO performs better than other approaches under the same memory consumption.
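The Examples Sparse step described in the abstract (decimate at a lower sample rate, then interpolate to realign the frames) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `sparse_decimate` and `align_frames`, the `stride` parameter, the `(T, H, W, C)` clip layout, and the choice of linear interpolation along the time axis are all assumptions made for illustration, since the abstract only says "interpolation".

```python
import numpy as np


def sparse_decimate(video: np.ndarray, stride: int) -> np.ndarray:
    """Keep every `stride`-th frame of a (T, H, W, C) clip (hypothetical helper)."""
    return video[::stride]


def align_frames(sparse: np.ndarray, target_len: int) -> np.ndarray:
    """Linearly interpolate a sparse clip along time to `target_len` frames.

    Linear interpolation is an assumption; the source does not specify the scheme.
    """
    t_sparse = sparse.shape[0]
    if t_sparse == 1:
        # A single stored frame can only be repeated.
        return np.repeat(sparse.astype(float), target_len, axis=0)
    # Fractional position in the sparse clip for each output frame.
    pos = np.linspace(0, t_sparse - 1, target_len)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, t_sparse - 1)
    # Blend weight toward the later neighbour, broadcast over H, W, C.
    w = (pos - lo).reshape(-1, 1, 1, 1)
    return (1.0 - w) * sparse[lo] + w * sparse[hi]


# Toy clip: 8 frames whose single pixel equals the frame index.
video = np.arange(8, dtype=float).reshape(8, 1, 1, 1)
sparse = sparse_decimate(video, stride=2)       # 4 frames are stored in memory
aligned = align_frames(sparse, target_len=8)    # realigned to 8 frames for replay
```

Under these assumptions, a stride of 2 halves the frames stored per video, so roughly twice as many exemplar videos fit in the same frame budget.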
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses catastrophic forgetting in video class-incremental learning. Existing methods tend to over-pursue accuracy on new classes and rely on memory sets to mitigate forgetting of old classes, but storage limits mean only a small number of representative videos can be kept. The authors therefore propose SNRO, which remembers old classes by slightly shifting the features of new classes, improving overall recognition accuracy under the same memory consumption.

### Main Contributions:

1. **Examples Sparse**: Sparse extraction stores more samples within the same storage space, reducing forgetting of old classes.
2. **Frame Alignment**: Frame alignment reduces the spatiotemporal information in video representations, which keeps the model from expanding excessively into the high-semantic space.
3. **Early Break**: Training is terminated early in each incremental task to avoid overfitting to new classes, thereby preserving performance on old classes.

### Experimental Results:

Experiments were conducted on the UCF101, HMDB51, and UESTC-MMEA-CL datasets. The results show that SNRO outperforms other methods under the same memory consumption, especially in reducing forgetting of old classes.

### Summary:

The paper proposes a new framework, SNRO, which alleviates catastrophic forgetting and improves the overall performance of video class-incremental learning by shifting the features of new classes and adjusting the training strategy.
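The Early Break idea summarized above can be sketched as a training loop that stops at a small, fixed epoch regardless of the nominal schedule. This is a hedged illustration only: `train_with_early_break`, `train_one_epoch`, `max_epochs`, and `break_epoch` are hypothetical names, and the source does not say how the break epoch is chosen.

```python
from typing import Callable, List


def train_with_early_break(
    train_one_epoch: Callable[[int], float],
    max_epochs: int,
    break_epoch: int,
) -> List[float]:
    """Run at most `break_epoch` epochs of an incremental task.

    `train_one_epoch` is a hypothetical callable that trains for one epoch
    and returns that epoch's loss.
    """
    losses: List[float] = []
    for epoch in range(max_epochs):
        if epoch >= break_epoch:
            # Early Break: stop before the model over-adapts to the new classes.
            break
        losses.append(train_one_epoch(epoch))
    return losses


# Stand-in for a real training step: the loss shrinks with the epoch index.
losses = train_with_early_break(lambda e: 1.0 / (e + 1), max_epochs=50, break_epoch=5)
```

Here the nominal schedule is 50 epochs, but only the first 5 are run for the current task.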