Abstract: Modern deep learning methods have achieved great success in machine learning and computer vision by learning from a set of pre-defined datasets. However, these methods perform unsatisfactorily when applied to real-world situations. The reason is that learning new tasks causes the trained model to quickly forget the knowledge of old tasks, a phenomenon referred to as catastrophic forgetting. Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks and ignore the problem in embedding networks, which are the basic networks for image retrieval, face recognition, zero-shot learning, etc. Unlike traditional incremental classification networks, the main challenge for embedding networks under the incremental learning setting is the semantic gap between the embedding spaces of two adjacent tasks. Thus, we propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI), which leverages zero-shot translation to estimate and compensate for the semantic gap without any exemplars. Then, we try to learn a unified representation for two adjacent tasks in the sequential learning process, which precisely captures the relationships between previous classes and current classes. In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks. We conduct extensive experiments on CUB-200-2011 and CIFAR100, and the experimental results demonstrate the effectiveness of our method. The code of our method has been released at https://github.com/Drkun/ZSTCI.
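To make the idea of compensating a semantic gap between two adjacent embedding spaces concrete, the following is a minimal illustrative sketch, not the ZSTCI implementation. It assumes a deliberately simple stand-in for the learned translation: a per-dimension affine map fit by least squares on current-task samples embedded by both the old and the new model, which is then applied to a stored old-class prototype. All function names, the toy embeddings, and the nearest-prototype classifier are hypothetical choices made for this example.

```python
# Illustrative sketch only: a per-dimension affine map is assumed here as a
# simple stand-in for a learned translation between embedding spaces.

def fit_affine_per_dim(old_embs, new_embs):
    """Fit new[d] ~ a[d] * old[d] + b[d] per dimension by least squares."""
    n = len(old_embs)
    dims = len(old_embs[0])
    scales, offsets = [], []
    for d in range(dims):
        xs = [e[d] for e in old_embs]
        ys = [e[d] for e in new_embs]
        mx, my = sum(xs) / n, sum(ys) / n
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = cov / var if var > 0 else 1.0  # fall back to identity scale
        scales.append(a)
        offsets.append(my - a * mx)
    return scales, offsets


def translate(vec, scales, offsets):
    """Map a vector from the old embedding space into the new one."""
    return [a * v + b for v, a, b in zip(vec, scales, offsets)]


def nearest_class(query, prototypes):
    """Return the class whose prototype is closest in squared distance."""
    def sq_dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v))
    return min(prototypes, key=lambda name: sq_dist(query, prototypes[name]))


# Toy data: the same current-task samples embedded by the old and new model.
old_embs = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
new_embs = [[1.0, 1.0], [3.0, -1.0], [5.0, 3.0]]
scales, offsets = fit_affine_per_dim(old_embs, new_embs)

# The old-class prototype was stored in the old space; no old exemplars are used.
old_proto = [4.0, 0.0]
new_proto = [6.0, 0.0]   # prototype of a current class, already in the new space
query = [8.5, -0.5]      # a test embedding of the old class, from the new model

uncompensated = nearest_class(query, {"old_class": old_proto, "new_class": new_proto})
compensated = nearest_class(
    query,
    {"old_class": translate(old_proto, scales, offsets), "new_class": new_proto},
)
print(uncompensated, compensated)
```

In this toy setting the stale prototype is matched incorrectly until it is mapped into the new space, which is the effect that gap compensation is meant to achieve; a real system would learn a far richer translation than the per-dimension map assumed here.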

Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning

Continual Learning with Weight Interpolation

Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization

Accelerated Learning for Restricted Boltzmann Machine with a Novel Momentum Algorithm

Weighted Ensemble Models Are Strong Continual Learners

Losing momentum in continuous-time stochastic optimisation

Momentum Benefits Non-iid Federated Learning Simply and Provably

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

Improving Deep Neural Networks' Training for Image Classification with Nonlinear Conjugate Gradient-Style Adaptive Momentum

Robust Fine-tuning of Zero-shot Models via Variance Reduction

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization

A Mean Field Ansatz for Zero-Shot Weight Transfer

Stochastic Gradient Descent with Nonlinear Conjugate Gradient-Style Adaptive Momentum

Boosting Adversarial Transferability Through Enhanced Momentum

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Incremental Embedding Learning Via Zero-Shot Translation

Rectification-based Knowledge Retention for Continual Learning

Learn and Consolidate: Continual Adaptation for Zero-Shot and Multilingual Neural Machine Translation

CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing

Learning Joint Feature Adaptation for Zero-Shot Recognition

Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning