An Adaptive Policy to Employ Sharpness-Aware Minimization

Weisen Jiang, Hansi Yang, Yu Zhang, James Kwok
DOI: https://doi.org/10.48550/arXiv.2304.14647
IF: 5.414
2023-04-28
Machine Learning
Abstract: Sharpness-aware minimization (SAM), which searches for flat minima by min-max optimization, has been shown to be useful in improving model generalization. However, since each SAM update requires computing two gradients, its computational cost and training time are both doubled compared to standard empirical risk minimization (ERM). Recent state-of-the-art methods reduce the fraction of SAM updates, and thus accelerate SAM, by switching between SAM and ERM updates randomly or periodically. In this paper, we design an adaptive policy to employ SAM based on the loss landscape geometry. Two efficient algorithms, AE-SAM and AE-LookSAM, are proposed. We theoretically show that AE-SAM has the same convergence rate as SAM. Experimental results on various datasets and architectures demonstrate the efficiency and effectiveness of the adaptive policy.
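To make the cost argument concrete, the following is a minimal sketch (not the authors' implementation) of why a SAM update costs two gradient evaluations where an ERM update costs one, and how randomly switching between the two reduces the total. The toy quadratic loss, the function names (`loss_grad`, `erm_step`, `sam_step`, `train`), and the values of `lr`, `rho`, and `sam_fraction` are all assumptions chosen for illustration. The adaptive policy of AE-SAM is not shown, since the abstract does not specify its criterion.

```python
import numpy as np

def loss_grad(w, counter):
    """Gradient of the toy loss 0.5 * ||w||^2 (an assumption); counts each call."""
    counter["grads"] += 1
    return w.copy()

def erm_step(w, lr, counter):
    # Standard ERM update: one gradient evaluation per step.
    return w - lr * loss_grad(w, counter)

def sam_step(w, lr, rho, counter):
    # First gradient: ascent direction toward the worst-case point in a rho-ball.
    g = loss_grad(w, counter)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Second gradient: taken at the perturbed weights, applied to the original ones.
    g_adv = loss_grad(w + eps, counter)
    return w - lr * g_adv

def train(steps, sam_fraction, seed=0):
    """Random switching policy: use SAM with probability sam_fraction, else ERM."""
    rng = np.random.default_rng(seed)
    w = np.array([1.0, -2.0])
    counter = {"grads": 0}
    for _ in range(steps):
        if rng.random() < sam_fraction:
            w = sam_step(w, lr=0.1, rho=0.05, counter=counter)
        else:
            w = erm_step(w, lr=0.1, counter=counter)
    return w, counter["grads"]

if __name__ == "__main__":
    for frac in (0.0, 0.5, 1.0):
        _, n_grads = train(steps=100, sam_fraction=frac)
        print(f"sam_fraction={frac}: {n_grads} gradient evaluations")
```

With `sam_fraction=1.0` every step is a SAM step and the gradient count is twice that of pure ERM; lowering the fraction moves the cost toward the ERM baseline, which is the trade-off that any switching policy, random, periodic, or adaptive, is trying to manage.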