Abstract: Real-world datasets often exhibit a long-tailed distribution, where the vast majority of classes, known as tail classes, have only a few samples. Traditional methods tend to overfit on these tail classes. Recently, a new approach called Imbalanced SAM (ImbSAM) was proposed to leverage the generalization benefits of Sharpness-Aware Minimization (SAM) for long-tailed distributions. Its main strategy is to enhance the smoothness of the loss function only for tail classes. However, we argue that improving generalization in long-tail scenarios requires a careful balance between head and tail classes. We show that neither SAM nor ImbSAM alone can fully achieve this balance. For SAM, we prove that although it enhances the model's generalization ability by escaping saddle points in the overall loss landscape, it does not effectively escape saddle points of the tail-class losses. Conversely, while ImbSAM is more effective at avoiding saddle points in tail classes, the head classes are trained insufficiently, resulting in significant performance drops. Based on these insights, we propose Stage-wise Saddle Escaping SAM (SSE-SAM), which uses the complementary strengths of ImbSAM and SAM in a phased approach. Initially, SSE-SAM follows the majority samples to avoid saddle points of the head-class loss. During the later phase, it focuses on tail classes to help them escape saddle points. Our experiments confirm that SSE-SAM is better at escaping saddle points for both head and tail classes, and that it improves performance.
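
For readers unfamiliar with the two optimizers named in the abstract, the formulas below sketch how they differ. The notation ($w$ for the weights, $\rho$ for the perturbation radius, $L_{\text{head}}$ and $L_{\text{tail}}$ for the two parts of the loss) is not taken from the abstract. It follows the commonly used first-order form of SAM and one reading of ImbSAM, so treat it as an assumption rather than the authors' exact formulation.

$$
\min_{w} \; \max_{\|\epsilon\|_2 \le \rho} L(w + \epsilon),
\qquad
\hat{\epsilon}(w) = \rho \, \frac{\nabla L(w)}{\|\nabla L(w)\|_2}
$$

With the loss split as $L = L_{\text{head}} + L_{\text{tail}}$, SAM perturbs the weights using the gradient of the whole loss, whereas an ImbSAM-style update perturbs only the tail-class term:

$$
g_{\text{SAM}}(w) = \nabla L\big(w + \hat{\epsilon}(w)\big),
\qquad
g_{\text{ImbSAM}}(w) = \nabla L_{\text{head}}(w) + \nabla L_{\text{tail}}\big(w + \hat{\epsilon}_{\text{tail}}(w)\big),
\qquad
\hat{\epsilon}_{\text{tail}}(w) = \rho \, \frac{\nabla L_{\text{tail}}(w)}{\|\nabla L_{\text{tail}}(w)\|_2}
$$

Because head-class samples dominate $\nabla L$, the perturbation in $g_{\text{SAM}}$ is driven mostly by the head classes, which is consistent with the abstract's claim that SAM does little for the tail-class loss.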

What problem does this paper attempt to address?

This paper addresses class imbalance in long-tailed datasets, in particular the gap in generalization performance between head classes and tail classes in deep learning models. Specifically:

1. **Balancing head classes and tail classes**:
   - In real-world datasets, head classes have many samples while tail classes have few. Traditional methods tend to overfit the tail classes during training, which leads to poor performance on them.
   - Sharpness-Aware Minimization (SAM) and Imbalanced SAM (ImbSAM) are two existing optimization methods, and each has limitations:
     - SAM helps the model escape saddle points of the overall loss, but it is much less effective for the tail-class loss.
     - ImbSAM handles saddle points of the tail classes better, but it leaves the head classes insufficiently trained.
2. **The proposed method**:
   - The paper proposes Stage-wise Saddle Escaping SAM (SSE-SAM), a phased method that combines the strengths of SAM and ImbSAM to improve saddle-point escape for head classes first and tail classes afterwards.
   - Its main contribution is improving overall performance through two-stage training: the first stage ensures that the head classes escape saddle points, and the second stage focuses on optimizing the tail classes.
3. **Theoretical analysis and experimental verification**:
   - The paper analyzes theoretically the strengths and weaknesses of SAM and ImbSAM in escaping saddle points, and evaluates SSE-SAM experimentally on long-tailed datasets such as CIFAR-100-LT, CIFAR-10-LT and ImageNet-LT, where it reports improved performance.

In summary, the core goal of this paper is to improve the generalization of deep learning models on long-tailed datasets by improving the optimization algorithm, and in particular to balance performance between head classes and tail classes.
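
The two-stage idea summarized above can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation. The toy quadratic losses, the function names (`perturb`, `sam_grad`, `imbsam_grad`, `train`) and the `switch_step` parameter are all hypothetical, and the sketch assumes that the first stage uses a SAM-style gradient on the whole loss and the second stage an ImbSAM-style gradient that perturbs only the tail-class term.

```python
import math

# Hypothetical toy losses on a 2-D weight vector (not from the paper).
# The head loss has a larger scale, mimicking the dominance of head-class samples.
def head_loss(w):
    return 0.5 * ((w[0] - 1.0) ** 2 + w[1] ** 2)

def tail_loss(w):
    return 0.05 * ((w[0] + 1.0) ** 2 + (w[1] - 1.0) ** 2)

def grad_head(w):
    return [w[0] - 1.0, w[1]]

def grad_tail(w):
    return [0.1 * (w[0] + 1.0), 0.1 * (w[1] - 1.0)]

def total_loss(w):
    return head_loss(w) + tail_loss(w)

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def perturb(w, g, rho):
    """Ascent step of SAM: w + rho * g / ||g||. Returns a copy of w if g is zero."""
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)
    return [wi + rho * gi / norm for wi, gi in zip(w, g)]

def sam_grad(w, rho):
    """SAM-style gradient: perturb with the gradient of the whole loss."""
    w_adv = perturb(w, add(grad_head(w), grad_tail(w)), rho)
    return add(grad_head(w_adv), grad_tail(w_adv))

def imbsam_grad(w, rho):
    """ImbSAM-style gradient (assumed form): perturb only the tail-class term."""
    w_adv = perturb(w, grad_tail(w), rho)
    return add(grad_head(w), grad_tail(w_adv))

def train(w, steps, switch_step, lr=0.1, rho=0.05):
    """Stage 1 (t < switch_step): SAM-style steps. Stage 2: ImbSAM-style steps."""
    w = list(w)
    for t in range(steps):
        g = sam_grad(w, rho) if t < switch_step else imbsam_grad(w, rho)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

if __name__ == "__main__":
    w0 = [-2.0, 2.0]
    w_final = train(w0, steps=200, switch_step=100)
    print("initial loss:", total_loss(w0))
    print("final loss:  ", total_loss(w_final))
```

The only design decision specific to the stage-wise scheme is the `switch_step` threshold inside `train`. When and how to switch between stages in practice is a question for the paper itself, and this sketch does not try to answer it.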

SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM
