Abstract: Data in the real world tends to exhibit a long-tailed label distribution, which poses great challenges for training neural networks in visual recognition. Existing methods tackle this problem mainly from the perspective of data quantity, i.e., the number of samples in each class. Specifically, they pay more attention to tail classes, for example by applying larger adjustments to the logit. However, in the training process, the quantity and difficulty of data are two intertwined and equally crucial problems. For some tail classes, the features of their instances are distinct and discriminative, which can also yield satisfactory accuracy; for some head classes, although they have sufficient samples, high semantic similarity with other classes and a lack of discriminative features lead to poor accuracy. Based on these observations, we propose Adaptive Logit Adjustment Loss (ALA Loss) to apply an adaptive adjusting term to the logit. The adaptive adjusting term is composed of two complementary factors: 1) a quantity factor, which pays more attention to tail classes, and 2) a difficulty factor, which adaptively pays more attention to hard instances in the training process. The difficulty factor can alleviate over-optimization on tail yet easy instances and under-optimization on head yet hard instances. The synergy of the two factors can not only advance the performance on tail classes even further, but also promote the accuracy on head classes. Unlike previous logit adjustment methods, which are concerned only with data quantity, ALA Loss tackles the long-tailed problem from a more comprehensive, fine-grained and adaptive perspective. Extensive experimental results show that our method achieves state-of-the-art performance on challenging recognition benchmarks, including ImageNet-LT, iNaturalist 2018, and Places-LT.
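
The idea of combining a quantity factor with a difficulty factor can be illustrated with a minimal NumPy sketch. The abstract does not state the formulas, so everything concrete here is an assumption made for illustration: the function name `ala_style_loss`, the cosine-similarity logits, the `scale` parameter, the log-based form of the quantity factor, and the `(1 - cos) / 2` form of the difficulty factor. It is not the authors' implementation.

```python
import numpy as np


def ala_style_loss(cosines, labels, class_counts, scale=30.0):
    """Cross-entropy with a per-sample margin subtracted from the target logit.

    cosines:      (N, C) cosine similarities in [-1, 1] (assumed logit form)
    labels:       (N,) integer class ids
    class_counts: (C,) number of training samples per class
    Returns (mean loss, per-sample margins).
    """
    cosines = np.asarray(cosines, dtype=float)
    labels = np.asarray(labels)
    counts = np.asarray(class_counts, dtype=float)
    rows = np.arange(len(labels))

    # Quantity factor (assumed form): equals 1 for the rarest class and
    # shrinks as a class becomes more frequent.
    quantity = 1.0 / np.log(counts / counts.min() + np.e - 1.0)

    # Difficulty factor (assumed form): in [0, 1], large when the sample
    # has low similarity to its own class, i.e. when it is hard.
    target_cos = cosines[rows, labels]
    difficulty = (1.0 - target_cos) / 2.0

    # Adaptive adjusting term: product of the two factors per sample.
    margin = quantity[labels] * difficulty

    # Subtract the margin from the target-class logit only, then rescale.
    adjusted = cosines.copy()
    adjusted[rows, labels] -= margin
    logits = scale * adjusted

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss = -log_probs[rows, labels].mean()
    return loss, margin


if __name__ == "__main__":
    # Two classes: class 0 is a head class, class 1 is a tail class.
    cos = np.array([[0.9, 0.1],   # head, easy
                    [0.2, 0.1],   # head, hard
                    [0.1, 0.9],   # tail, easy
                    [0.1, 0.2]])  # tail, hard
    loss, margin = ala_style_loss(cos, [0, 0, 1, 1], [1000, 10])
    print("loss:", loss)
    print("margins:", margin)
```

Under these assumed forms, a hard head-class sample receives a larger margin than an easy tail-class sample, which mirrors the abstract's point that difficulty and quantity should be weighed together rather than quantity alone.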

Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition

Class-Conditional Sharpness-Aware Minimization for Deep Long-Tailed Recognition

Long-tailed Visual Recognition via Gaussian Clouded Logit Adjustment

LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Adjusting Logit in Gaussian Form for Long-Tailed Visual Recognition

Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

Long-tailed Visual Recognition with Deep Models: A Methodological Survey and Evaluation

CR-SAM: Curvature Regularized Sharpness-Aware Minimization

A Dual Progressive Strategy for Long-Tailed Visual Recognition

Margin Calibration for Long-Tailed Visual Recognition

Improving Resistance to Noisy Label Fitting by Reweighting Gradient in SAM

Towards Efficient and Scalable Sharpness-Aware Minimization

Calibrating Class Activation Maps for Long-Tailed Visual Recognition

Gradient Constrained Sharpness-Aware Prompt Learning for Vision-Language Models

SAU: A Dual-Branch Network to Enhance Long-Tailed Recognition via Generative Models

Friendly Sharpness-Aware Minimization

GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization

Adaptive Logit Adjustment Loss for Long-Tailed Visual Recognition