Abstract: In the real open world, data tends to follow long-tailed class distributions, motivating the well-studied long-tailed recognition (LTR) problem. Naive training produces models that are biased toward common classes, achieving higher accuracy on them than on rare classes. The key to addressing LTR is to balance various aspects of learning, including the data distribution, training losses, and gradients. We explore an orthogonal direction, weight balancing, motivated by the empirical observation that a naively trained classifier has "artificially" larger weight norms for common classes (because abundant data exists to train them, unlike the rare classes). We investigate three techniques to balance weights: L2-normalization, weight decay, and MaxNorm. We first point out that L2-normalization "perfectly" balances per-class weights to be unit norm, but such a hard constraint might prevent classes from learning better classifiers. In contrast, weight decay penalizes larger weights more heavily and so learns small, balanced weights; the MaxNorm constraint encourages small weights to grow within a norm ball but caps all weights by the radius. Our extensive study shows that both weight decay and MaxNorm help learn balanced weights and greatly improve LTR accuracy. Surprisingly, weight decay, although underexplored in LTR, significantly improves over prior work. Therefore, we adopt a two-stage training paradigm and propose a simple approach to LTR: (1) learning features using the cross-entropy loss by tuning weight decay, and (2) learning classifiers using a class-balanced loss by tuning weight decay and MaxNorm. Our approach achieves state-of-the-art accuracy on five standard benchmarks, serving as a future baseline for long-tailed recognition.
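The three weight-balancing techniques named in the abstract can be illustrated with a minimal sketch on a toy two-class classifier. This is not the authors' implementation: the function names, the toy weights, and the choice to enforce MaxNorm by projecting each per-class vector back into the norm ball are assumptions made for illustration only.

```python
import math

def l2_norm(w):
    """Euclidean norm of one class's weight vector."""
    return math.sqrt(sum(x * x for x in w))

def l2_normalize(weights):
    """Hard constraint: rescale every per-class vector to unit norm.
    Assumes no vector is all zeros."""
    return [[x / l2_norm(w) for x in w] for w in weights]

def weight_decay_step(weights, grads, lr, decay):
    """One SGD step with an L2 penalty: w <- w - lr * (grad + decay * w).
    The shrinkage term decay * w is larger for larger weights."""
    return [
        [x - lr * (g + decay * x) for x, g in zip(w, gw)]
        for w, gw in zip(weights, grads)
    ]

def maxnorm_project(weights, radius):
    """Assumed MaxNorm enforcement: vectors outside the norm ball are
    rescaled onto its surface; vectors inside are left free to grow."""
    projected = []
    for w in weights:
        n = l2_norm(w)
        scale = radius / n if n > radius else 1.0
        projected.append([x * scale for x in w])
    return projected

# Hypothetical classifier: a "common" class with a large norm (5.0)
# and a "rare" class with a small norm (0.5).
W = [[3.0, 4.0], [0.3, 0.4]]
zero_grads = [[0.0, 0.0], [0.0, 0.0]]

W_unit = l2_normalize(W)
W_decayed = weight_decay_step(W, zero_grads, lr=0.1, decay=0.5)
W_capped = maxnorm_project(W, radius=1.0)

print([round(l2_norm(w), 3) for w in W_unit])
print([round(l2_norm(w), 3) for w in W_decayed])
print([round(l2_norm(w), 3) for w in W_capped])
```

In this toy setting, L2-normalization forces both classes to the same norm, a decay step with zero gradient shrinks the common class by a larger absolute amount than the rare class, and the projection caps only the common class while leaving the rare class unchanged.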

Balanced Gradient Penalty Improves Deep Long-Tailed Learning

UniGrad-FS: Unified Gradient Projection with Flatter Sharpness for Continual Learning

Long-Tailed Learning as Multi-Objective Optimization

Feature-Balanced Loss for Long-Tailed Visual Recognition

GREB: Gradient Re-Balanced Loss for Long-Tailed Multi-Label Classification

Balanced complement loss for long-tailed image classification

The Equalization Losses: Gradient-Driven Training for Long-tailed Object Recognition

Gradient-Aware Logit Adjustment Loss for Long-tailed Classifier

Balanced Contrastive Learning for Long-Tailed Visual Recognition

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

Bt-Vmf Contrastive and Collaborative Learning for Long-Tailed Visual Recognition

Fed-GraB: Federated Long-tailed Learning with Self-Adjusting Gradient Balancer

Deep Long-Tailed Learning: A Survey

Long-tailed Visual Recognition via Gaussian Clouded Logit Adjustment

Long-Tailed Recognition via Weight Balancing

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

LTRL: Boosting Long-tail Recognition via Reflective Learning

Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition

A Deep Learning Model for Long-Tail Visual Recognition

fBGD: Learning Embeddings from Positive Unlabeled Data with BGD