Abstract: This paper considers the minimization of a sum of an expectation-valued smooth nonconvex function and a nonsmooth block-separable convex regularizer. By combining a randomized block-coordinate descent method with a proximal variable sample-size stochastic gradient (VSSG) method, we propose a randomized block proximal VSSG algorithm. In each iteration, a single block is randomly chosen to update its estimates by a VSSG scheme with an increasing batch of sampled gradients, while the remaining blocks are kept invariant. With appropriately chosen batch sizes, we prove that every limit point for almost every sample path is a stationary point when blocks are chosen either randomly or cyclically. We further show that the ergodic mean-squared error of the gradient mapping diminishes at the rate of $\mathcal{O}(1/K)$, where $K$ denotes the iteration index, and establish that the iteration and oracle complexity to obtain an $\epsilon$-stationary point are $\mathcal{O}(1/\epsilon)$ and $\mathcal{O}(1/\epsilon^2)$, respectively. Furthermore, under a $\mu$-proximal Polyak-{\L}ojasiewicz condition with the batch size increasing at a suitable geometric rate, we prove that the suboptimality diminishes at a {\em geometric} rate, the {\em optimal} deterministic rate. In addition, if $L_{\rm ave}$ denotes the average of the block-specific Lipschitz constants, the iteration and oracle complexity to obtain an $\epsilon$-optimal solution are $\mathcal{O}((L_{\rm ave}/\mu)\ln(1/\epsilon))$ and $\mathcal{O}\left((1/\epsilon)^{1+c}\right)$, respectively, matching the deterministic result. When the number of blocks is $n=1$, we obtain the {\em optimal} oracle complexity bound $\mathcal{O}(1/\epsilon)$, while $c>0$ when $n\geq 2$ represents the positive cost of multiple blocks. Finally, preliminary numerical experiments support our theoretical findings.
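
The update described in the abstract (pick one block at random, average an increasing batch of sampled gradients for that block, take a proximal step, leave the other blocks unchanged) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the names `block_prox_vssg` and `block_grad` are hypothetical, the regularizer is taken to be $\lambda\|x\|_1$, the toy loss is a convex least-squares problem chosen only because its behavior is easy to check, and the step size and batch schedule are arbitrary rather than the choices analyzed in the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, which is block-separable."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def block_prox_vssg(x0, blocks, block_grad, step, lam, num_iters, batch_size, rng):
    """Illustrative randomized block proximal scheme with growing batches.

    block_grad(x, idx, n, rng) is a hypothetical oracle returning the average
    of n sampled gradients of the smooth term, restricted to coordinates idx.
    """
    x = x0.astype(float).copy()
    for k in range(num_iters):
        # Choose a single block uniformly at random; other blocks stay fixed.
        idx = blocks[rng.integers(len(blocks))]
        # Average batch_size(k) sampled gradients for the chosen block.
        g = block_grad(x, idx, batch_size(k), rng)
        # Proximal gradient step on the chosen block only.
        x[idx] = soft_threshold(x[idx] - step * g, step * lam)
    return x

# Toy problem (assumed for illustration): f(x) = E[0.5 * (a^T x - b)^2],
# with a ~ N(0, I) and b = a^T x_true + small noise.
x_true = np.array([1.0, -2.0, 0.0, 0.0, 3.0, 0.0])

def block_grad(x, idx, n, rng):
    A = rng.standard_normal((n, x_true.size))
    b = A @ x_true + 0.1 * rng.standard_normal(n)
    residual = A @ x - b
    return A[:, idx].T @ residual / n

rng = np.random.default_rng(0)
blocks = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
x_hat = block_prox_vssg(
    np.zeros(6), blocks, block_grad,
    step=0.5, lam=0.01, num_iters=300,
    batch_size=lambda k: 4 * (k + 1),  # arbitrary increasing schedule
    rng=rng,
)
```

In this sketch the growing batch size plays the role of variance reduction: later iterations use more samples per step, so the averaged gradient is less noisy and a constant step size can be kept throughout.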

Nonconvex Stochastic Bregman Proximal Gradient Method for Nonconvex Composite Problems

Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning

A Bregman Proximal Stochastic Gradient Method with Extrapolation for Nonconvex Nonsmooth Problems

A Bregman Stochastic Method for Nonconvex Nonsmooth Problem Beyond Global Lipschitz Gradient Continuity

Approximate Bregman Proximal Gradient Algorithm for Relatively Smooth Nonconvex Optimization

Bregman Proximal Gradient Algorithm with Extrapolation for a Class of Nonconvex Nonsmooth Minimization Problems

Momentum-based variance-reduced stochastic Bregman proximal gradient methods for nonconvex nonsmooth optimization

Stochastic Bregman Proximal Gradient Method Revisited: Kernel Conditioning and Painless Variance Reduction

Stochastic Bregman Subgradient Methods for Nonsmooth Nonconvex Optimization Problems

Accelerated Bregman Proximal Gradient Methods for Relatively Smooth Convex Optimization

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Smoothing randomized block-coordinate proximal gradient algorithms for nonsmooth nonconvex composite optimization

Barzilai-Borwein Proximal Gradient Methods for Multiobjective Composite Optimization Problems with Improved Linear Convergence

Inexact Bregman Proximal Gradient Method and its Inertial Variant with Absolute and Partial Relative Stopping Criteria

Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization

Adaptive smoothing mini-batch stochastic accelerated gradient method for nonsmooth convex stochastic composite optimization

A Single-Loop Stochastic Proximal Quasi-Newton Method for Large-Scale Nonsmooth Convex Optimization

A Randomized Block Proximal Variable Sample-size Stochastic Gradient Method for Composite Nonconvex Stochastic Optimization

Non-Convex Stochastic Composite Optimization with Polyak Momentum

Stochastic smoothing accelerated gradient method for nonsmooth convex composite optimization