Global Convergence of Noisy Gradient Descent.

Xuliang Qin, Xin Xu, Xiaopeng Luo
DOI: https://doi.org/10.1109/SMC53654.2022.9945497
2022-01-01
Abstract: Noise plays an important role in gradient-based optimization methods, and a series of numerical experiments has demonstrated that adding gradient noise improves learning for neural networks. However, the mathematical interpretation of the noise remains a challenge. In this paper, we show that the noise variation can be regarded as a smoothing factor, and we prove that, under certain conditions, noisy gradient descent (NG) enjoys linear global convergence in expectation. We contribute to this problem by introducing an intermediate that connects the NG method to the smoothed function. On the one hand, this connection reveals that applying the NG method to a function is the same as applying the gradient method to the corresponding function smoothed by the noise; on the other hand, it allows us to establish the convergence behavior of NG in a global sense. Moreover, we also consider which conditions keep the global minimizer of the smoothed function close to the original global minimizer.
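The link between NG and the smoothed function can be illustrated with a minimal sketch. The abstract does not state the exact form of the noise, so the code below assumes Gaussian noise added to the point at which the gradient is evaluated, i.e. x_{k+1} = x_k - step * grad_f(x_k + xi_k) with xi_k ~ N(0, sigma^2). Under that assumption, and when differentiation and expectation can be interchanged, the expected noisy gradient equals the gradient of f_sigma(x) = E[f(x + xi)]. The test function f(x) = x^2 + A*cos(B*x), the constants A and B, and all function names are hypothetical choices made for illustration, not taken from the paper.

```python
import math
import random

# Hypothetical test function f(x) = x^2 + A*cos(B*x); A and B are
# illustrative constants, not values from the paper.
A, B = 2.0, 5.0


def grad_f(x):
    """Gradient of f(x) = x^2 + A*cos(B*x)."""
    return 2.0 * x - A * B * math.sin(B * x)


def grad_f_smoothed(x, sigma):
    """Closed-form gradient of f_sigma(x) = E[f(x + xi)], xi ~ N(0, sigma^2).

    For Gaussian xi, E[cos(B*(x + xi))] = exp(-(B*sigma)^2 / 2) * cos(B*x),
    so smoothing damps the oscillatory term and shifts f by the constant
    sigma^2, which does not affect the gradient.
    """
    damping = math.exp(-0.5 * (B * sigma) ** 2)
    return 2.0 * x - A * B * damping * math.sin(B * x)


def mean_noisy_gradient(x, sigma, n_samples, rng):
    """Monte Carlo estimate of E[grad_f(x + xi)]."""
    total = 0.0
    for _ in range(n_samples):
        total += grad_f(x + rng.gauss(0.0, sigma))
    return total / n_samples


def noisy_gradient_descent(x0, step, sigma, n_steps, rng):
    """Assumed NG update: evaluate the gradient at a Gaussian-perturbed point."""
    x = x0
    for _ in range(n_steps):
        xi = rng.gauss(0.0, sigma)
        x = x - step * grad_f(x + xi)
    return x
```

Comparing `mean_noisy_gradient(x, sigma, n, rng)` with `grad_f_smoothed(x, sigma)` for a large sample size `n` gives a numerical check of the identity. In this toy case a larger `sigma` damps the cosine term more strongly, which is one way to read the claim that the noise acts as a smoothing factor.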