Normalized Stochastic Heavy Ball with Adaptive Momentum

Z. G. Wen, Xiaoge Deng, Tao Sun, Dongsheng Li
DOI: https://doi.org/10.3233/faia230568
2023-01-01
Abstract: The heavy ball momentum technique is widely used to accelerate machine learning training and has demonstrated significant practical success in optimization tasks. However, most heavy ball methods require a preset hyperparameter, which entails excessive tuning, and even a calibrated fixed hyperparameter may not lead to optimal performance. In this paper, we propose an adaptive criterion for choosing the normalized momentum-related hyperparameter, motivated by the quadratic optimization training problem, to eliminate the burden of tuning this hyperparameter and thus allow for a computationally efficient optimizer. We prove theoretically that the proposed adaptive method converges for L-Lipschitz functions. In addition, we verify its practical efficiency on a range of existing machine learning benchmarks for image classification tasks. The numerical results show that, besides the speed improvement, the proposed methods offer further advantages, including greater robustness to large learning rates and better generalization.