Abstract: Exemplar-free class-incremental learning (EFCIL) presents a significant challenge as the old class samples are absent for new task learning. Due to the severe imbalance between old and new class samples, the learned classifiers can be easily biased toward the new ones. Moreover, continually updating the feature extractor under EFCIL can compromise the discriminative power of old class features, e.g., leading to less compact and more overlapping distributions across classes. Existing methods mainly focus on handling biased classifier learning. In this work, both cases are considered using the proposed method. Specifically, we first introduce a Distribution-Based Global Classifier (DBGC) to avoid bias factors in existing methods, such as data imbalance and sampling. More importantly, the compromised distributions of old classes are simulated via a simple operation, variance enlarging (VE). Incorporating VE based on DBGC results in a novel classification loss for EFCIL. This loss is proven equivalent to an Adaptive Margin Softmax Cross Entropy (AMarX). The proposed method is thus called Adaptive Margin Global Classifier (AMGC). AMGC is simple yet effective. Extensive experiments show that AMGC achieves superior image classification results on its own under a challenging EFCIL setting. Detailed analysis is also provided for further demonstration.

What problem does this paper attempt to address?

This paper addresses two main problems in exemplar-free class-incremental learning (EFCIL):

1. **Classifier bias**: Old-class samples are unavailable when a new task is learned, so the training data is severely imbalanced between old and new classes and the classifier becomes biased toward the new classes.
2. **Old-class feature degradation**: Continually updating the feature extractor weakens the discriminative power of old-class features, so their distributions become less compact and overlap more, as shown in Figure 1.

### Solutions

To address these problems, the authors propose the Adaptive Margin Global Classifier (AMGC). Its main contributions are:

- **Introducing the Distribution-Based Global Classifier (DBGC)**: DBGC is trained on the statistics of old and new classes (mean vector \(\mu\) and covariance matrix \(\Sigma\)) and aims to alleviate the sampling bias and local optimum problems of existing methods.
- **Simulating old-class feature degradation**: The Variance Enlarging (VE) technique simulates the degradation of old-class features by increasing the diagonal entries of each old-class covariance matrix:

  \[ \hat{\Sigma}_k=\Sigma_k + \lambda\Lambda_k \]

  where \(\Sigma_k\) is the covariance matrix of old class \(k\), \(\Lambda_k\) is the diagonal matrix formed from the diagonal entries of \(\Sigma_k\), and \(\lambda> 0\) is a hyperparameter.
- **Proposing the Adaptive Margin Softmax Cross-Entropy loss (AMarX)**: Combining VE with DBGC yields a new classification loss, AMarX. It can be read as a softmax cross-entropy loss with an adaptive, per-class margin, which lets it account for old-class feature degradation.

The final model, AMGC, combines DBGC and AMarX. The experimental results show that AMGC achieves state-of-the-art performance on multiple datasets.

### Formula Summary

- **DBGC loss**:

  \[ L_{DB}(\mu,\Sigma;\theta,\phi)=\frac{1}{K}\sum_{k = 1}^{K}\log\left(\sum_{j = 1}^{K}e^{\omega_j^T\mu_k+\frac{1}{2}\omega_j^T\Sigma_k\omega_j+\delta_j}\right) \]

- **AMarX loss**:

  \[ L_{o}^{AMarX}=-\frac{1}{O_t}\sum_{k = 1}^{O_t}\log\frac{e^{\omega_k^T\mu_k + b_k - m_k}}{e^{\omega_k^T\mu_k + b_k - m_k}+\sum_{j\neq k}e^{\omega_j^T\mu_k + b_j+\sigma_{j,k}+\beta_{j,k}}} \]

  where \(m_k=\frac{\lambda}{2}\omega_k^T\Lambda_k\omega_k\).

These two losses are the components from which AMGC's classifier training is built.
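The variance-enlarging step, the margin \(m_k\), and the DBGC loss can be sketched numerically as follows. This is a minimal NumPy illustration and not the authors' implementation: the function names (`enlarge_variance`, `adaptive_margin`, `dbgc_loss`) and the array layout are assumptions made for this example, and `dbgc_loss` follows the formula exactly as printed in the summary above, with `offsets` standing in for \(\delta_j\).

```python
import numpy as np


def enlarge_variance(sigma, lam):
    """VE: Sigma_hat = Sigma + lam * Lambda, where Lambda holds only the
    diagonal entries of Sigma (hypothetical helper, for illustration)."""
    diag_part = np.diag(np.diag(sigma))  # zero out off-diagonal entries
    return sigma + lam * diag_part


def adaptive_margin(w_k, sigma_k, lam):
    """Margin m_k = (lam / 2) * w_k^T Lambda_k w_k for a single class k."""
    variances = np.diag(sigma_k)  # 1-D vector of per-dimension variances
    return 0.5 * lam * float(np.sum(variances * w_k ** 2))


def dbgc_loss(weights, offsets, means, covs):
    """DBGC loss as printed in the summary.

    weights: (K, d) classifier vectors, offsets: (K,) stand-ins for delta_j,
    means: (K, d) class means, covs: (K, d, d) class covariances.
    """
    num_classes = means.shape[0]
    total = 0.0
    for k in range(num_classes):
        linear = weights @ means[k]  # w_j^T mu_k for every j
        # 0.5 * w_j^T Sigma_k w_j for every j
        quad = 0.5 * np.einsum("jd,de,je->j", weights, covs[k], weights)
        z = linear + quad + offsets
        # Numerically stable log-sum-exp over j
        total += z.max() + np.log(np.sum(np.exp(z - z.max())))
    return total / num_classes


if __name__ == "__main__":
    sigma = np.array([[2.0, 1.0], [1.0, 3.0]])
    w = np.array([1.0, 2.0])
    lam = 0.5
    print(enlarge_variance(sigma, lam))
    print(adaptive_margin(w, sigma, lam))
```

Because \(\Lambda_k\) is diagonal, the margin reduces to a weighted sum of squared classifier weights, so classes whose features have larger variance along the directions the classifier relies on receive a larger margin. The margin also equals half the increase in \(\omega_k^T\Sigma_k\omega_k\) caused by VE, which is one way to sanity-check an implementation.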

Adaptive Margin Global Classifier for Exemplar-Free Class-Incremental Learning
