Abstract: The Long-Tailed Recognition (LTR) problem emerges in the context of learning from highly imbalanced datasets, in which the number of samples among different classes is heavily skewed. LTR methods aim to accurately learn a dataset comprising both a larger Head set and a smaller Tail set. We propose a theorem where under the assumption of strong convexity of the loss function, the weights of a learner trained on the full dataset are within an upper bound of the weights of the same learner trained strictly on the Head. Next, we assert that by treating the learning of the Head and Tail as two separate and sequential steps, Continual Learning (CL) methods can effectively update the weights of the learner to learn the Tail without forgetting the Head. First, we validate our theoretical findings with various experiments on the toy MNIST-LT dataset. We then evaluate the efficacy of several CL strategies on multiple imbalanced variations of two standard LTR benchmarks (CIFAR100-LT and CIFAR10-LT), and show that standard CL methods achieve strong performance gains in comparison to baselines and approach solutions that have been tailor-made for LTR. We also assess the applicability of CL techniques on real-world data by exploring CL on the naturally imbalanced Caltech256 dataset and demonstrate its superiority over state-of-the-art classifiers. Our work not only unifies LTR and CL but also paves the way for leveraging advances in CL methods to tackle the LTR challenge more effectively.
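The upper-bound claim in the abstract can be illustrated on a toy problem. The sketch below is not the paper's theorem or its MNIST-LT experiment; it uses ridge regression, whose L2 term makes the loss `mu`-strongly convex, and checks a standard strong-convexity inequality: `||w_full - w_head|| <= ||grad L_tail(w_head)|| / (mu * (1 + IF))`. Here `IF` is assumed to mean head size divided by tail size, and the data generator, function names, and parameter values are all hypothetical.

```python
import numpy as np


def ridge_fit(X, y, mu):
    """Closed-form minimizer of (1/(2n))*||X w - y||^2 + (mu/2)*||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + mu * np.eye(d), X.T @ y / n)


def ridge_grad(X, y, w, mu):
    """Gradient of the same ridge objective, evaluated at w."""
    n = X.shape[0]
    return X.T @ (X @ w - y) / n + mu * w


def head_vs_full(n_head, n_tail, mu, d=5, seed=0):
    """Return (distance, bound) for synthetic head/tail regression data."""
    rng = np.random.default_rng(seed)
    # Hypothetical data: head and tail follow different linear rules.
    X_h = rng.normal(size=(n_head, d))
    y_h = X_h @ rng.normal(size=d) + 0.1 * rng.normal(size=n_head)
    X_t = rng.normal(size=(n_tail, d))
    y_t = X_t @ rng.normal(size=d) + 0.1 * rng.normal(size=n_tail)

    w_head = ridge_fit(X_h, y_h, mu)  # trained on the head only
    w_full = ridge_fit(np.vstack([X_h, X_t]), np.concatenate([y_h, y_t]), mu)

    dist = np.linalg.norm(w_full - w_head)
    tail_share = n_tail / (n_head + n_tail)  # equals 1 / (1 + IF)
    # The head gradient vanishes at w_head, so only the tail term remains.
    bound = tail_share * np.linalg.norm(ridge_grad(X_t, y_t, w_head, mu)) / mu
    return dist, bound


if __name__ == "__main__":
    for n_head in (50, 200, 1000):
        dist, bound = head_vs_full(n_head=n_head, n_tail=20, mu=0.1)
        print(f"IF={n_head / 20:5.1f}  distance={dist:.4f}  bound={bound:.4f}")
```

Ridge regression is used only because both minimizers have a closed form, so the inequality can be checked exactly; the deep networks in the paper's experiments are not strongly convex in general.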

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper addresses Long-Tailed Recognition (LTR). Real-world datasets are often severely imbalanced across categories: some categories (the Head set) have far more samples than others (the Tail set). This imbalance causes deep learning models to perform markedly worse on the tail categories, even when they perform well on the head categories.

### Main Contributions

1. **Theoretical Contributions**:
   - A theorem is proposed and proven: if the loss function is strongly convex, the distance between the model weights trained on the complete dataset and those trained only on the head set is bounded from above. The bound tightens as the imbalance factor of the dataset or the strong convexity parameter of the loss function increases.
2. **Methodological Innovations**:
   - Based on this theorem, the LTR problem is decomposed into two consecutive tasks: first learning the head categories, then learning the tail categories. Continual Learning (CL) methods are used to update the model weights so that the tail categories are learned without forgetting the head categories.
3. **Experimental Validation**:
   - Experiments on four datasets (MNIST-LT, CIFAR100-LT, CIFAR10-LT, and Caltech256) validate the effectiveness of CL methods for LTR. The results show that standard CL methods achieve significant performance improvements under long-tailed distributions, approaching or even surpassing methods designed specifically for LTR.

### Experimental Results

1. **Upper Bound Validation**:
   - On the MNIST-LT dataset, the theoretical upper bound was validated by varying the imbalance factor (IF) and the strong convexity parameter (µ). As IF or µ increases, the distance between the weights trained on the complete dataset and those trained only on the head set decreases, consistent with the theory.
2. **LTR Benchmark Testing**:
   - On the CIFAR100-LT and CIFAR10-LT datasets, three common CL strategies (LwF, EWC, and GPM) were applied and compared with existing LTR methods. The CL methods significantly improve performance on the tail categories, although their performance on the head categories may be slightly below that of some methods designed specifically for LTR.
3. **Real-World Data**:
   - On the naturally imbalanced Caltech256 dataset, an improved EWC method was used for classification. The results show that CL methods handle real-world long-tailed data well, surpassing the current state-of-the-art methods.

### Conclusion

This paper proposes a theoretical framework that unifies LTR and CL, and it experimentally validates the effectiveness of CL methods on the long-tailed distribution problem. These findings offer new insights and methods for applying CL techniques to LTR.
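The two-step decomposition described under Methodological Innovations (learn the head, then learn the tail while penalizing drift from the head weights) can be sketched with an EWC-style quadratic penalty. This is a minimal illustration on linear regression, not the paper's implementation or its improved EWC variant. The function names, the synthetic data, and the diagonal Fisher estimate `mean(X**2)` (which assumes unit-variance Gaussian noise) are all assumptions made for the example.

```python
import numpy as np


def fit_head(X, y, ridge=1e-3):
    """Step 1: fit the head data (small ridge term for numerical stability)."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + ridge * np.eye(d), X.T @ y / n)


def diag_fisher(X):
    """Diagonal Fisher estimate for linear regression with unit-variance noise.

    The full Fisher matrix would be X^T X / n; EWC keeps only its diagonal.
    """
    return np.mean(X ** 2, axis=0)


def fit_tail_ewc(X_t, y_t, w_head, fisher, lam):
    """Step 2: minimize the tail loss plus an EWC-style penalty (lam > 0).

    Objective: (1/(2n))*||X_t w - y_t||^2
               + (lam/2) * sum_i fisher_i * (w_i - w_head_i)^2
    """
    n = X_t.shape[0]
    A = X_t.T @ X_t / n + lam * np.diag(fisher)
    b = X_t.T @ y_t / n + lam * fisher * w_head
    return np.linalg.solve(A, b)


def make_head_tail(n_head=500, n_tail=25, d=5, seed=0):
    """Hypothetical imbalanced data: head and tail follow different rules."""
    rng = np.random.default_rng(seed)
    X_h = rng.normal(size=(n_head, d))
    y_h = X_h @ rng.normal(size=d) + 0.1 * rng.normal(size=n_head)
    X_t = rng.normal(size=(n_tail, d))
    y_t = X_t @ rng.normal(size=d) + 0.1 * rng.normal(size=n_tail)
    return X_h, y_h, X_t, y_t


def mse(X, y, w):
    """Mean squared error of weights w on data (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


if __name__ == "__main__":
    X_h, y_h, X_t, y_t = make_head_tail()
    w_head = fit_head(X_h, y_h)
    fisher = diag_fisher(X_h)
    for lam in (0.01, 1.0, 100.0):
        w = fit_tail_ewc(X_t, y_t, w_head, fisher, lam)
        print(f"lam={lam:7.2f}  head MSE={mse(X_h, y_h, w):.3f}  "
              f"tail MSE={mse(X_t, y_t, w):.3f}")
```

The penalty weight `lam` controls the trade-off: a larger value keeps the weights closer to the head solution (less forgetting), while a smaller value lets the tail data dominate.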

Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified Framework

Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition

Long-Tailed Recognition via Weight Balancing

LTRL: Boosting Long-tail Recognition via Reflective Learning

SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model

Decoupled Contrastive Learning for Long-Tailed Recognition

A dual-branch model with inter- and intra-branch contrastive loss for long-tailed recognition

Deep Long-Tailed Learning: A Survey

Decoupling Representation and Classifier for Long-Tailed Recognition

Balanced Contrastive Learning for Long-Tailed Visual Recognition

Don't Stop Learning: Towards Continual Learning for the CLIP Model

Towards Effective Collaborative Learning in Long-Tailed Recognition

DELTA: Decoupling Long-Tailed Online Continual Learning

Continual Learning of Large Language Models: A Comprehensive Survey

NCL++: Nested Collaborative Learning for long-tailed visual recognition

Open Long-Tailed Recognition In A Dynamic World

Expanding continual few-shot learning benchmarks to include recognition of specific instances

Mutual Exclusive Modulator for Long-Tailed Recognition

Enhanced multi-branch learning for long-tailed image recognition

Multimodal Framework for Long-Tailed Recognition