Simplifying Neural Network Training Under Class Imbalance

Ravid Shwartz-Ziv, Micah Goldblum, Yucen Lily Li, C. Bayan Bruss, Andrew Gordon Wilson
2023-12-05
Abstract: Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models. The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures. Notably, we demonstrate that simply tuning existing components of standard deep learning pipelines, such as the batch size, data augmentation, optimizer, and label smoothing, can achieve state-of-the-art performance without any such specialized class imbalance methods. We also provide key prescriptions and considerations for training under class imbalance, and an understanding of why imbalance methods succeed or fail.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how, when training deep learning models on class-imbalanced datasets, one can match the performance of specially designed imbalance-handling methods simply by tuning components of the standard deep learning pipeline, such as batch size, data augmentation, optimizer, and label smoothing.

### Background Problem

Real-world datasets are often highly class-imbalanced. For example, only a very small fraction of credit card transactions are fraudulent, and most cancer screening results are negative. This imbalance can degrade the performance of deep learning models. Much research has therefore focused on loss functions, sampling techniques, or two-stage training procedures designed specifically for class imbalance.

### Core Contributions of the Paper

This paper takes a different approach: **by tuning only the components of the standard deep learning training pipeline, without any complex methods designed specifically for class imbalance, state-of-the-art performance can be achieved**. Specifically:

1. **Batch Size**: Smaller batch sizes perform better in class-imbalanced settings.
2. **Data Augmentation**: Data augmentation has a larger effect on class-imbalanced data, especially on minority-class accuracy.
3. **Model Architecture**: Larger models perform well on class-balanced data but are prone to overfitting on imbalanced data of the same size.
4. **Self-Supervised Learning (SSL)**: Adding a self-supervised loss during training can improve feature representations and generalization.
5. **Sharpness-Aware Minimization (SAM)**: Encouraging flatter loss minima for minority-class samples improves the decision boundary and helps prevent overfitting.
6. **Label Smoothing**: Applying more label smoothing to minority-class samples helps prevent overfitting.
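The label smoothing item above can be illustrated with a minimal sketch. The summary only says that minority classes receive more smoothing; the linear schedule, the function names, and the `eps_min` / `eps_max` values below are illustrative assumptions, not the authors' exact recipe.

```python
import math

def per_class_smoothing(class_counts, eps_min=0.0, eps_max=0.4):
    """Assumed schedule: smoothing grows linearly as a class gets rarer.

    The most frequent class gets eps_min, the rarest gets eps_max.
    """
    lo, hi = min(class_counts), max(class_counts)
    if hi == lo:  # balanced data: every class gets the same smoothing
        return [eps_min for _ in class_counts]
    return [eps_min + (hi - c) / (hi - lo) * (eps_max - eps_min)
            for c in class_counts]

def smoothed_target(label, eps_per_class):
    """Soft target: (1 - eps) on the true class plus eps / K spread uniformly."""
    k = len(eps_per_class)
    eps = eps_per_class[label]
    return [(1.0 - eps) * (1.0 if j == label else 0.0) + eps / k
            for j in range(k)]

def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target and softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_z) for t, z in zip(target, logits))

if __name__ == "__main__":
    counts = [900, 90, 10]            # hypothetical class frequencies
    eps = per_class_smoothing(counts)
    print("smoothing per class:", eps)
    print("target for rarest class:", smoothed_target(2, eps))
    print("loss:", soft_cross_entropy([0.2, 0.1, 1.5], smoothed_target(2, eps)))
```

With this schedule, a sample from the majority class keeps a hard one-hot target, while a sample from the rarest class has part of its probability mass spread across all classes, which discourages overconfident fits to the few minority examples.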
### Experimental Verification

The authors conducted extensive experiments on multiple benchmark datasets, including image and tabular data, to verify the effectiveness of these adjustments and to demonstrate their robustness in practical settings. They also provide a detailed analysis explaining why these adjustments can significantly improve performance.

### Summary

The main contribution of this paper is showing that simply tuning existing deep learning training components can effectively address class imbalance, without complex, specially designed methods. This finding gives researchers and practitioners a simpler and more efficient way to handle class-imbalanced datasets.