Adaptive learning rate optimization algorithms with dynamic bound based on Barzilai-Borwein method

Zhi-Jun Wang, He-Bei Gao, Xiang-Hong Wang, Shuai-Ye Zhao, Hong Li, Xiao-Qin Zhang
DOI: https://doi.org/10.1016/j.ins.2023.03.050
IF: 8.1
2023-07-01
Information Sciences
Abstract: Optimization algorithms directly influence how well a neural network model trains. The Barzilai-Borwein (BB) method is used in stochastic gradient descent (SGD) and other deep learning optimization algorithms because of its fast convergence. To improve the stability of the BB step size and the output quality of network models in deep learning, this paper presents two improved optimization algorithms based on the BB method: BBbound and AdaBBbound. BBbound limits fluctuation in the BB step size by generating an upper bound on the step in the current iteration, which avoids occasional overly long steps. AdaBBbound is an adaptive gradient method based on the BB method with a dynamic bound. It converges quickly in the early stage, then calculates the learning rate and smoothly transitions to SGD by setting better conditions for the BB method. We analyzed the performance of the two algorithms with respect to the initial conditions and the learning rate change curve, and tested them on popular network models such as ResNet and DenseNet. The results showed that the new optimization algorithms achieve high stability and significantly improve the performance of the network model.
computer science, information systems
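The idea of capping the BB step size, as described in the abstract, can be illustrated with a minimal sketch. This is not the paper's BBbound algorithm: the abstract does not state how the upper bound is generated, so the fixed constant `upper`, the fallback rate `lr0`, the function names, and the deterministic quadratic test problem are all assumptions made for illustration. The step size itself is the standard BB1 formula, (s·s)/(s·y), where s is the change in parameters and y is the change in gradient.

```python
import numpy as np


def bb_step(s, y, eps=1e-12):
    """Standard BB1 step size (s.s)/(s.y); None if the curvature estimate is unusable."""
    sy = float(s @ y)
    if sy <= eps:
        return None
    return float(s @ s) / sy


def bounded_bb_descent(grad, x0, lr0=0.1, upper=0.5, iters=50):
    """Gradient descent with a BB step clipped to a fixed upper bound.

    `upper` is a hypothetical constant; the paper generates its bound
    per iteration by a rule not given in the abstract.
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - lr0 * g_prev  # first step uses the fallback rate
    lrs = []
    for _ in range(iters):
        g = grad(x)
        step = bb_step(x - x_prev, g - g_prev)
        # Fall back to lr0 when BB is undefined, otherwise clip long steps.
        lr = lr0 if step is None else min(step, upper)
        lrs.append(lr)
        x_prev, g_prev = x, g
        x = x - lr * g
    return x, lrs


# Toy problem: minimize 0.5 * x^T A x - b^T x, whose minimizer solves A x = b.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
x_final, lrs = bounded_bb_descent(lambda x: A @ x - b, np.zeros(2))
```

On this toy problem the unclipped BB step would reach 1.0, so the cap at `upper` is what keeps the recorded learning rates in `lrs` from exceeding 0.5. A stochastic setting, as in the paper, would additionally need gradient averaging or smoothing that this sketch omits.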