Learning to Predict Gradients for Semi-Supervised Continual Learning
Yan Luo, Yongkang Wong, Mohan Kankanhalli, Qi Zhao
DOI: https://doi.org/10.1109/tnnls.2024.3361375
IF: 14.255
2024-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: A key challenge for machine intelligence is to learn new visual concepts without forgetting previously acquired knowledge. Continual learning (CL) aims to address this challenge. However, a gap remains between CL and human learning. In particular, humans continually learn from samples associated with known or unknown labels in their daily lives, whereas existing CL and semi-supervised CL (SSCL) methods assume that the training samples are associated with known labels. Specifically, we are interested in two questions: 1) how to utilize unrelated unlabeled data for the SSCL task and 2) how unlabeled data affect learning and catastrophic forgetting in the CL task. To explore these issues, we formulate a new SSCL method that can be generically applied to existing CL models. Furthermore, we propose a novel gradient learner that learns from labeled data to predict gradients on unlabeled data. In this way, the unlabeled data can fit into the supervised CL framework. We extensively evaluate the proposed method on mainstream CL methods, adversarial CL (ACL), and semi-supervised learning (SSL) tasks. The proposed method achieves state-of-the-art performance on classification accuracy and backward transfer (BWT) in the CL setting while achieving the desired performance on classification accuracy in the SSL setting. This implies that unlabeled images can improve how well CL models generalize to unseen data and can significantly alleviate catastrophic forgetting. The code is available at https://github.com/luoyan407/grad_prediction.git.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
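
The abstract reports backward transfer (BWT) without defining it. As a reading aid, the sketch below uses the definition common in the CL literature: the average, over all tasks except the last, of the final accuracy on a task minus the accuracy measured right after that task was learned. Negative values indicate forgetting. The function name `backward_transfer`, the accuracy-matrix layout, and the numbers are illustrative assumptions, not taken from the paper or its repository.

```python
def backward_transfer(acc):
    """Average change in accuracy on earlier tasks after all tasks are learned.

    acc[i][j] is assumed to hold the test accuracy on task j measured
    right after training on task i (tasks are learned in index order).
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        raise ValueError("BWT needs at least two tasks")
    final_row = acc[-1]  # accuracies after the last task has been learned
    # Compare the final accuracy on each earlier task with the accuracy
    # obtained immediately after that task was first learned.
    diffs = [final_row[j] - acc[j][j] for j in range(num_tasks - 1)]
    return sum(diffs) / (num_tasks - 1)


# Hypothetical accuracy matrix for three tasks (made-up numbers).
acc = [
    [0.90, 0.10, 0.10],
    [0.80, 0.85, 0.10],
    [0.70, 0.80, 0.88],
]
bwt = backward_transfer(acc)
print(f"BWT = {bwt:.3f}")
```

With these made-up numbers, accuracy on the first two tasks drops after later training, so the BWT is negative. A method that alleviates catastrophic forgetting would push this value toward zero or above.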