Scaling of Class-wise Training Losses for Post-hoc Calibration

Seungjin Jung, Seungmo Seo, Yonghyun Jeong, Jongwon Choi
2023-06-19
Abstract: The class-wise training losses often diverge as a result of the various levels of intra-class and inter-class appearance variation, and we find that the diverging class-wise training losses cause the uncalibrated prediction with its reliability. To resolve the issue, we propose a new calibration method to synchronize the class-wise training losses. We design a new training loss to alleviate the variance of class-wise training losses by using multiple class-wise scaling factors. Since our framework can compensate the training losses of overfitted classes with those of under-fitted classes, the integrated training loss is preserved, preventing the performance drop even after the model calibration. Furthermore, our method can be easily employed in the post-hoc calibration methods, allowing us to use the pre-trained model as an initial model and reduce the additional computation for model calibration. We validate the proposed framework by employing it in the various post-hoc calibration methods, which generally improves calibration performance while preserving accuracy, and discover through the investigation that our approach performs well with unbalanced datasets and untuned hyperparameters.
Machine Learning
### What problem does this paper attempt to solve?

This paper addresses poor prediction calibration caused by imbalance among the class-wise training losses of deep learning models. Because classes differ in their intra-class and inter-class appearance variation, their training losses tend to diverge, which leads to uncalibrated predictions. To resolve this, the authors propose a new calibration method, **class-wise loss scaling**, that synchronizes the training losses of different classes.

#### Key issues:

1. **Divergence of class-wise losses**: Differences in intra-class and inter-class appearance variation cause the training losses of different classes to diverge, which lowers the reliability of the model's predictions.
2. **Limitations of existing calibration methods**: Existing calibration methods can require a large amount of additional computation and may degrade performance.
3. **Challenges of post-hoc calibration**: Post-hoc calibration methods (such as temperature scaling and parameterized temperature scaling) reduce computation by starting from a pre-trained model, but they often degrade performance because the over-confident predictions of the pre-trained model are difficult to correct with a limited number of adjustable parameters.

### Solution:

The authors propose a calibration mechanism called **class-wise loss scaling**. Its core idea is to reduce the variance of the training losses across classes by controlling the scale of each class's loss, thereby achieving better-calibrated predictions. The specific steps are:

1. **Analyze the relationship between class-wise losses and calibration error**: Experiments show that the variance of the class-wise losses is highly correlated with calibration error, so reducing that variance can reduce the calibration error.
2. **Design class-wise loss scaling factors**: Introduce multiple class-wise scaling factors that adjust the training losses of different classes while keeping the overall training loss unchanged, which avoids performance degradation.
3. **Combine with post-hoc calibration methods**: The method can be easily applied to various post-hoc calibration methods, which allows a pre-trained model to serve as the initial model and reduces the additional computation required.

With this method, the authors improve calibration performance while largely preserving the model's original accuracy. The method also performs well on unbalanced datasets and with untuned hyperparameters.

### Summary:

The main contribution of this paper is a class-wise loss scaling mechanism that addresses the calibration problem caused by imbalanced class-wise losses in deep learning models. The method improves calibration without significantly reducing accuracy, reduces computational overhead, and can be applied to multiple post-hoc calibration methods.
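
The scaling idea in the solution steps can be illustrated numerically. The following is a minimal sketch, not the authors' formulation: it computes a mean cross-entropy loss per class, then applies a hypothetical interpolation rule (the function `scaling_factors` and its parameter `alpha` are assumptions made for illustration) that pulls each class loss toward the mean. Under this rule the sum of the class-wise losses is unchanged while their variance shrinks.

```python
import math


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def class_wise_losses(batch_logits, labels, num_classes):
    """Mean cross-entropy per class; classes with no samples get 0.0."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for logits, y in zip(batch_logits, labels):
        sums[y] += cross_entropy(logits, y)
        counts[y] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def scaling_factors(losses, alpha=0.5):
    """Hypothetical rule (an assumption, not taken from the paper).

    Each factor w_c is chosen so that w_c * L_c equals
    (1 - alpha) * L_c + alpha * mean(L). Summed over classes this
    leaves the total unchanged, and the variance of the scaled losses
    is (1 - alpha) ** 2 times the original variance.
    Zero-loss classes are left unscaled; in that edge case the total
    is no longer exactly preserved.
    """
    mean = sum(losses) / len(losses)
    return [((1 - alpha) * l + alpha * mean) / l if l > 0 else 1.0
            for l in losses]


# Toy batch: class 0 is fitted well (low loss), classes 1 and 2 less so.
batch_logits = [[4.0, 0.0, 0.0], [3.0, 1.0, 0.0],
                [0.5, 1.0, 0.0], [0.0, 0.2, 0.4]]
labels = [0, 0, 1, 2]

losses = class_wise_losses(batch_logits, labels, num_classes=3)
weights = scaling_factors(losses, alpha=0.5)
scaled = [w * l for w, l in zip(weights, losses)]

print("variance before/after:", variance(losses), variance(scaled))
print("total before/after:   ", sum(losses), sum(scaled))
```

In a real calibration run the factors would scale the per-class loss terms of whatever post-hoc calibrator is being fitted; how the paper derives and updates its factors is not specified in this summary, so the interpolation rule above should be read only as one simple way to satisfy the two stated properties (lower variance, preserved total).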