A novel spatial-frequency domain network for zero-shot incremental learning

Jie Ren, Yang Zhao, Weichuan Zhang, Changming Sun
2024-02-11
Abstract: Zero-shot incremental learning aims to enable the model to generalize to new classes without forgetting previously learned classes. However, the semantic gap between old and new sample classes can lead to catastrophic forgetting. Additionally, existing algorithms fail to capture significant information from each sample image domain, impairing models' classification performance. Therefore, this paper proposes a novel Spatial-Frequency Domain Network (SFDNet), which contains a Spatial-Frequency Feature Extraction (SFFE) module and an Attention Feature Alignment (AFA) module, to improve the Zero-Shot Translation for Class Incremental algorithm. Firstly, the SFFE module is designed, which contains a dual attention mechanism for obtaining salient spatial-frequency feature information. Secondly, a novel feature fusion module is used to obtain fused spatial-frequency domain features. Thirdly, the Nearest Class Mean classifier is utilized to select the most suitable category. Finally, iteration between tasks is performed using the Zero-Shot Translation model. The proposed SFDNet can effectively extract spatial-frequency feature representations from input images, improve the accuracy of image classification, and fundamentally alleviate catastrophic forgetting. Extensive experiments on the CUB 200-2011 and CIFAR100 datasets demonstrate that our proposed algorithm outperforms state-of-the-art incremental learning algorithms.
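The Nearest Class Mean classification step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `class_means` and `ncm_predict`, the Euclidean distance, and the toy 2-D features are all assumptions chosen for illustration.

```python
import numpy as np

def class_means(features, labels):
    """Return the sorted class ids and the mean feature vector (prototype) of each class."""
    classes = np.unique(labels)
    prototypes = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, prototypes

def ncm_predict(queries, classes, prototypes):
    """Assign each query to the class whose prototype is nearest (Euclidean distance, assumed)."""
    # Pairwise distance matrix of shape (n_queries, n_classes).
    dists = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Toy 2-D features standing in for network embeddings (hypothetical data).
features = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 3.8]])
labels = np.array([0, 0, 1, 1])

classes, prototypes = class_means(features, labels)
queries = np.array([[0.1, 0.1], [3.9, 4.1]])
predictions = ncm_predict(queries, classes, prototypes)
```

In an incremental setting only the prototypes need to be stored per class, which is why this classifier is a common choice when old training data is no longer available.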
Computer Science
What problem does this paper attempt to address?
The paper addresses catastrophic forgetting in zero-shot incremental learning (ZSIL), together with the failure of existing algorithms to capture the salient information in each sample image domain. Specifically:

1. **Catastrophic Forgetting**: When the model learns new classes, it often forgets the classes it learned before. This phenomenon, called catastrophic forgetting, leads to a significant decline in the model's performance on old tasks.
2. **Semantic Gap**: The semantic gap between new and old sample classes is large, so the model has difficulty maintaining good recognition of old classes while handling new ones.
3. **Insufficient Feature Extraction**: Existing incremental learning algorithms fail to fully capture the salient information in each sample image, which degrades the model's classification performance.

To solve these problems, the paper proposes a new Spatial-Frequency Domain Network (SFDNet), which contains the following modules:

- **Spatial-Frequency Feature Extraction (SFFE) Module**: Extracts spatial and frequency domain features from the input image.
- **Attention Feature Alignment (AFA) Module**: Combines attention mechanisms in the spatial and frequency domains to strengthen the network's focus on regions of interest and make image feature extraction more effective.

### Main Contributions

1. **Proposing SFDNet**: By introducing the spatial and frequency feature extraction modules (SFE and FFE) and the attention feature alignment module (AFA), the network extracts image features more comprehensively, thereby improving the accuracy of image classification and alleviating catastrophic forgetting.
2. **Fusing Spatial and Frequency Domain Features**: For the first time, a frequency domain feature extraction network is introduced into zero-shot incremental learning, giving the network richer image feature information and improving classification accuracy.

### Method Overview

The main components of SFDNet are:

- **Spatial Domain Feature Extraction Module (SFE)**: Uses ResNet12 as the backbone network to extract the spatial domain features of the image.
- **Frequency Domain Feature Extraction Module (FFE)**: Converts the image to the frequency domain through the discrete cosine transform (DCT), extracts high- and low-frequency components, and then extracts features through ResNet12.
- **Spatial-Frequency Domain Attention Feature Alignment Module (AFA)**: Combines the spatial attention mechanism (SENet) and the frequency domain attention mechanism (FcaNet), and aligns and fuses the features through the cross-alignment mechanism (CADA-VAE).
- **Zero-Shot Translation Module**: Compensates for the semantic gap between new and old feature embeddings and ensures that the prototypes are updated in a common embedding space.

### Experimental Verification

The paper reports experiments on the CUB-200-2011 and CIFAR-100 datasets to verify the effectiveness of the proposed method. The experimental results show that SFDNet outperforms existing state-of-the-art incremental learning algorithms on zero-shot incremental learning tasks.

### Conclusion

By introducing spatial-frequency domain feature extraction and attention mechanisms, SFDNet extracts image features more effectively, significantly improves classification accuracy and model stability, and fundamentally alleviates catastrophic forgetting.
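The frequency-domain step described for the FFE module (DCT, then separation into high- and low-frequency components) can be sketched as follows. The summary does not say how the two bands are separated, so the diagonal cutoff `u + v < cutoff`, the square grayscale input, and the names `dct_matrix` and `split_frequency` are assumptions made for illustration. The subsequent ResNet12 feature extraction is not shown.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return mat

def split_frequency(image, cutoff):
    """Split a square grayscale image into low- and high-frequency parts.

    Coefficients with u + v < cutoff are treated as low frequency
    (an assumed rule, not taken from the paper).
    """
    n = image.shape[0]
    d = dct_matrix(n)
    coeffs = d @ image @ d.T  # 2-D DCT: transform rows and columns
    u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    low_mask = (u + v) < cutoff
    low = d.T @ (coeffs * low_mask) @ d    # inverse DCT of the low band
    high = d.T @ (coeffs * ~low_mask) @ d  # inverse DCT of the high band
    return low, high

# Random 8x8 array standing in for an image patch (hypothetical data).
rng = np.random.default_rng(0)
image = rng.random((8, 8))
low, high = split_frequency(image, cutoff=4)
```

Because the transform is linear and orthonormal, `low + high` reconstructs the original image, so the two bands can be fed to separate feature extractors without losing information.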