Accelerated Stochastic Gradient Descent with Step Size Selection Rules.

Zhuang Yang, Cheng Wang, Zhemin Zhang, Jonathan Li
DOI: https://doi.org/10.1016/j.sigpro.2019.02.010
IF: 4.729
2019-01-01
Signal Processing
Abstract: Accelerated stochastic gradient descent (ASGD) methods, which incorporate accelerated proximal gradient (APG) and stochastic gradient (SG), have received considerable attention recently for solving regularized risk minimization problems in signal/image processing, statistics and machine learning. However, there has been a paucity of practical guidance proposed for resolving one of the major issues in ASGD: how to choose an appropriate step size. To solve this problem, we propose to use the Barzilai-Borwein (BB) method to automatically compute the step size for the accelerated mini-batch Prox-SVRG (Acc-Prox-SVRG) method (the state-of-the-art ASGD method), thereby obtaining a new accelerated method: Acc-Prox-SVRG-BB. We prove the convergence of Acc-Prox-SVRG-BB and show that its complexity is comparable with the best known stochastic gradient methods. In addition, we incorporate Beck and Teboulle's APG (FISTA) and Prox-SVRG in a mini-batch setting and obtain another new accelerated gradient descent method, FISTA-Prox-SVRG, which requires the selection of fewer unknown parameters than those required in Acc-Prox-SVRG. Finally, we introduce the BB method into FISTA-Prox-SVRG to further show the efficacy of the BB method. Numerical results demonstrate the advantage of our algorithms. (C) 2019 Elsevier B.V. All rights reserved.
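To make the idea of a BB-computed step size concrete, here is a minimal sketch of plain (non-accelerated) mini-batch Prox-SVRG in which the step size is recomputed from a Barzilai-Borwein ratio at every snapshot. It is not the Acc-Prox-SVRG-BB or FISTA-Prox-SVRG algorithm of the paper: the momentum step is omitted, and the l1-regularized least-squares objective, the 1/m scaling of the BB ratio, the safeguard threshold, and all function and parameter names are assumptions made only for illustration.

```python
import math
import random

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def ls_gradient(A, b, w):
    """Gradient of (1/2n) * sum_i (a_i . w - b_i)^2 over the rows given."""
    n, d = len(A), len(w)
    g = [0.0] * d
    for a, y in zip(A, b):
        r = sum(ai * wi for ai, wi in zip(a, w)) - y
        for j in range(d):
            g[j] += r * a[j] / n
    return g

def bb_step(w_new, w_old, g_new, g_old, m):
    """BB ratio ||s||^2 / (s . y), scaled by 1/m (scaling assumed here).

    Returns None when the curvature estimate s . y is too small to trust.
    """
    s = [p - q for p, q in zip(w_new, w_old)]
    y = [p - q for p, q in zip(g_new, g_old)]
    ss = sum(si * si for si in s)
    sy = sum(si * yi for si, yi in zip(s, y))
    return ss / (m * sy) if sy > 1e-12 else None

def prox_svrg_bb(A, b, lam, eta0=0.1, epochs=30, m=20, batch=2, seed=0):
    """Mini-batch Prox-SVRG for least squares + lam * ||w||_1 with BB steps."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    w_snap, eta = [0.0] * d, eta0
    w_prev = g_prev = None
    for _ in range(epochs):
        g_snap = ls_gradient(A, b, w_snap)  # full gradient at the snapshot
        if w_prev is not None:
            step = bb_step(w_snap, w_prev, g_snap, g_prev, m)
            if step is not None:
                eta = step  # otherwise keep the previous step size
        w_prev, g_prev = w_snap, g_snap
        w = list(w_snap)
        for _ in range(m):
            idx = [rng.randrange(n) for _ in range(batch)]
            A_b, b_b = [A[i] for i in idx], [b[i] for i in idx]
            g_w = ls_gradient(A_b, b_b, w)
            g_s = ls_gradient(A_b, b_b, w_snap)
            # Variance-reduced gradient estimate, then a proximal step.
            v = [p - q + r for p, q, r in zip(g_w, g_s, g_snap)]
            w = soft_threshold([wi - eta * vi for wi, vi in zip(w, v)], eta * lam)
        w_snap = w
    return w_snap

# Toy usage on hypothetical data: two orthogonal features, two samples each.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
b = [1.0, 0.0, 1.0, 0.0]
w_hat = prox_svrg_bb(A, b, lam=0.1)
```

Only the first epoch uses the hand-picked `eta0`; every later epoch derives its step size from the last two snapshots and their full gradients, which is the kind of automatic selection the abstract describes. For this toy problem the objective is 0.25 * ((w1 - 1)^2 + w2^2) + 0.1 * (|w1| + |w2|), whose minimizer is (0.8, 0), so `w_hat` should land close to that point.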