Boosting sharpness-aware training with dynamic neighborhood

Junhong Chen, Hong Li, C.L. Philip Chen
DOI: https://doi.org/10.1016/j.patcog.2024.110496
IF: 8
2024-04-20
Pattern Recognition
Abstract: Learning algorithms motivated by minimizing the sharpness of the loss surface are a hot research topic in improving generalization. Existing methods usually solve a constrained min–max problem to minimize sharpness and find flat minima. However, most constraints (i.e., the neighborhood used to define sharpness) are inappropriate, leading to sub-optimal results. This paper theoretically explores the optimal neighborhood from the viewpoint of the Probably Approximately Correct-Bayesian (PAC-Bayesian) framework. A closed form of the optimal neighborhood is provided; it is determined by the Hessian matrix and the scales of the parameters. A generalization bound is then derived that serves as a guiding principle in the design of the sharpness minimization algorithm. The Dynamic neighborhood-based Sharpness-Aware Minimization algorithm is proposed, which adaptively adjusts the neighborhood during training to gain better performance. The algorithm is also proved to converge at the rate O(log T/T). Experimental results demonstrate that the proposed algorithm outperforms other methods (e.g., accuracy +2.86% over baseline on CIFAR-100 for VGG-16).
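For readers unfamiliar with the constrained min–max problem mentioned in the abstract, the following is a minimal sketch of one standard sharpness-aware update with a fixed L2-ball neighborhood of radius rho. It is not the paper's dynamic-neighborhood algorithm, whose closed-form neighborhood is not given in the abstract. The function name `sam_step`, the toy quadratic loss, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One fixed-radius sharpness-aware update (illustrative, not the paper's method)."""
    g = grad_fn(w)
    # Inner max: first-order approximation of the worst-case perturbation
    # inside an L2 ball of radius rho (small constant avoids division by zero).
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Outer min: descend along the gradient evaluated at the perturbed point.
    return w - lr * grad_fn(w + eps)

# Hypothetical toy problem: L(w) = 0.5 * w^T A w with unequal curvature.
A = np.diag([10.0, 1.0])
loss = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w = np.array([1.0, 1.0])
initial_loss = loss(w)
for _ in range(100):
    w = sam_step(w, grad)
final_loss = loss(w)
```

With a constant rho the iterates settle into a small region around the minimum rather than converging exactly to it, which is one reason the choice of neighborhood matters.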
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
What problem does this paper attempt to address?