AdaGC: A Novel Adaptive Optimization Algorithm with Gradient Bias Correction

Qi Wang, Feng Su, Shipeng Dai, Xiaojun Lu, Yang Liu
DOI: https://doi.org/10.1016/j.eswa.2024.124956
IF: 8.5
2024-01-01
Expert Systems with Applications
Abstract: A proper optimization algorithm is essential for fitting the parameters of neural networks. Practitioners want to train neural networks as fast as possible and obtain optimal parameters, yet existing optimization algorithms remain far from perfect in both efficiency and convergence. In this paper, we propose an Adaptive optimization algorithm with Gradient bias Correction (AdaGC) for training neural networks. In this algorithm, the iterative direction is improved using the gradient deviation and momentum, and the step size is adaptively revised using the second-order moment of the gradient deviation. Intuitively, when the iterative vector changes significantly, it is corrected by enlarging the effect of the iterative deviation and reducing the effect of momentum. At the same time, the step size decreases correspondingly, based on the second-order moment of the gradient deviation. When the iterative vector changes only slightly, the gradient deviation also decreases and the influence of momentum increases. The second-order moment of the gradient deviation is correspondingly reduced, resulting in a larger step size. In this case, the larger iterative step is likely to reduce the iteration time and increase the rate of convergence. Furthermore, a rigorous theoretical analysis proves the convergence of the AdaGC algorithm for both convex and non-convex optimization problems. The proposed algorithm is also validated through extensive experiments, in which it shows better performance and faster convergence across different networks and applications. Code is available at https://github.com/breeze7-opt/optimizer.
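The abstract gives only the intuition, not the update equations, so the following is a minimal sketch of that intuition and not the published AdaGC rule. It assumes the gradient deviation is the current gradient minus the previous momentum estimate, uses a fixed hypothetical weight `gamma` on the deviation term, and borrows Adam-style bias correction. The names `init_state`, `deviation_aware_step`, and `gamma` are illustrative and do not come from the paper or its repository.

```python
import math


def init_state(n):
    """Zero-initialised optimizer state for n parameters."""
    return {"m": [0.0] * n, "s": [0.0] * n, "t": 0}


def deviation_aware_step(theta, grad, state, lr=1e-2, beta1=0.9,
                         beta2=0.999, gamma=0.1, eps=1e-8):
    """One illustrative update. NOT the published AdaGC rule.

    Assumptions (ours, not the paper's):
      - gradient deviation = current gradient minus previous momentum
      - gamma is a fixed, hypothetical weight on the deviation term
    """
    t = state["t"] + 1
    new_theta, new_m, new_s = [], [], []
    for x, g, m, s in zip(theta, grad, state["m"], state["s"]):
        dev = g - m                               # assumed gradient deviation
        m = beta1 * m + (1 - beta1) * g           # momentum: EMA of gradients
        s = beta2 * s + (1 - beta2) * dev * dev   # second moment of deviation
        m_hat = m / (1 - beta1 ** t)              # Adam-style bias correction
        s_hat = s / (1 - beta2 ** t)
        direction = m_hat + gamma * dev           # momentum plus deviation
        # Large recent deviations inflate s_hat and shrink the step.
        new_theta.append(x - lr * direction / (math.sqrt(s_hat) + eps))
        new_m.append(m)
        new_s.append(s)
    return new_theta, {"m": new_m, "s": new_s, "t": t}
```

Under these assumptions, a steady gradient sequence drives the deviation toward zero, so the accumulated second moment stays small and the effective step grows. An erratic sequence keeps the deviation large and the step small, which is the qualitative behaviour the abstract describes.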