Abstract: Many problems arising in machine learning can ultimately be reduced to optimization problems. Convex optimization algorithms have been successfully adapted to various kinds of learning optimization problems, and whether the optimal convergence rate can be attained is one of the basic questions in the study of optimization algorithms. Sparsity is a further concern in sparse learning problems. So far, a great many stochastic optimization algorithms have been presented for solving large-scale learning problems. However, most state-of-the-art stochastic optimization algorithms attain the optimal convergence rate only in terms of the averaged output, and the desired sparsity cannot be guaranteed. In contrast to the averaged output, the individual solution (the last iterate) usually offers better sparsity. Unfortunately, it is not easy to make the individual convergence rate optimal, and the optimal individual convergence rate in strongly convex cases has been extensively explored as an open problem. For smooth objective optimization problems, it is well known that the step-size rule proposed by Nesterov can accelerate the convergence rate of first-order gradient algorithms by orders of magnitude, and the optimal individual convergence rate is derived at the same time. Recently, Nesterov's acceleration algorithm has been widely applied to learning optimization problems with smooth loss functions, and a large number of stochastic optimization algorithms for smooth cases have been developed based on Nesterov's acceleration strategy. Whether Nesterov's step-size rule can be extended to obtain the optimal individual convergence rate for nonsmooth objective optimization problems is therefore an interesting question. In this paper, Nesterov's step-size rule for smooth objectives is incorporated into the gradient method for solving nonsmooth objective optimization problems. In particular, focusing on the classic first-order gradient methods, we present a new projected subgradient method with Nesterov's step-size rule. It is proved that the proposed method achieves the optimal individual convergence rate when solving nonsmooth optimization problems. This conclusion is stronger than the previous result that the regular projected subgradient method obtains the optimal convergence rate only in terms of the averaged output. It can also be regarded as an approximate answer to the question of whether first-order gradient methods can achieve the optimal individual convergence rate. Compared with regular projected subgradient methods, in which the averaged output is used, and modified projected subgradient methods, in which a linear interpolation operation is employed, the subgradient-like operation in our method follows the extrapolation evaluation, which brings significant benefits in preserving sparsity when solving hinge loss optimization problems on an l1-norm ball. Experiments on two synthetic datasets verify the correctness of our theoretical analysis, and experiments on several benchmark datasets demonstrate that the proposed method has almost the same convergence behavior but offers better sparsity. As future work, the optimal individual convergence in regularized sparse learning problems and the stability of individual convergence in stochastic optimization will be considered. Moreover, whether the optimal individual convergence rate for strongly convex objective functions can be achieved by using Nesterov's step-size rule will be investigated.
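To make the scheme described in the abstract concrete, the following is a minimal sketch of a projected subgradient iteration in which the subgradient step follows an extrapolation step, applied to the averaged hinge loss on an l1-norm ball, with the last iterate returned rather than an average. The abstract does not state the actual step-size rule, so the momentum weight (t - 1)/(t + 2) and the step size c/sqrt(t) used here are illustrative assumptions, as are all function and parameter names. This is not the authors' algorithm and carries none of its proven guarantees.

```python
import math


def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {w : ||w||_1 <= radius} (sort-based)."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    u = sorted((abs(x) for x in v), reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - radius) / k
        if uk - t > 0:  # holds for a prefix of k; the last such t is the threshold
            theta = t
    # Soft-threshold each coordinate; small coordinates become exactly zero.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


def hinge_loss(w, X, y):
    """Averaged hinge loss (1/n) * sum_i max(0, 1 - y_i <w, x_i>)."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        total += max(0.0, 1.0 - margin)
    return total / len(X)


def hinge_subgradient(w, X, y):
    """One subgradient of the averaged hinge loss at w."""
    n = len(X)
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        if margin < 1.0:  # only examples inside the margin contribute
            for j, xj in enumerate(xi):
                g[j] -= yi * xj / n
    return g


def extrapolated_projected_subgradient(X, y, radius=1.0, steps=200, c=1.0):
    """Return the last iterate of the sketched method.

    ASSUMPTIONS (not taken from the paper): momentum weight (t - 1)/(t + 2)
    and step size c / sqrt(t).
    """
    d = len(X[0])
    w_prev = [0.0] * d
    w = [0.0] * d
    for t in range(1, steps + 1):
        beta = (t - 1) / (t + 2)  # assumed extrapolation weight
        # Extrapolation point; it may lie outside the l1 ball.
        z = [wj + beta * (wj - pj) for wj, pj in zip(w, w_prev)]
        g = hinge_subgradient(z, X, y)  # subgradient taken at the extrapolated point
        eta = c / math.sqrt(t)  # assumed step size
        step = [zj - eta * gj for zj, gj in zip(z, g)]
        w_prev, w = w, project_l1_ball(step, radius)
    return w  # individual (last) iterate, not an average
```

Calling `extrapolated_projected_subgradient(X, y, radius=1.0)` with `X` as a list of feature lists and `y` as a list of labels in {-1, +1} returns a weight vector inside the l1 ball. Because the final operation is a projection that soft-thresholds coordinates, the returned iterate can contain exact zeros, whereas averaging many iterates would generally blur them.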

A Nesterov's Accelerated Projected Gradient Method for Monotone Variational Inequalities

A Trust Region-Type Method for Solving Monotone Variational Inequality

An extra gradient Anderson-accelerated algorithm for pseudomonotone variational inequalities

A feasible smoothing accelerated projected gradient method for nonsmooth convex optimization

Adaptive Methods for Variational Inequalities with Relatively Smooth and Relatively Strongly Monotone Operators

An accelerated stochastic extragradient-like algorithm with new stepsize rules for stochastic variational inequalities

Novel projected gradient methods for solving pseudomonotone variational inequality problems with applications to optimal control problems

Accelerating Nesterov's Method for Strongly Convex Functions with Lipschitz Gradient

Accelerated gradient methods for nonconvex nonlinear and stochastic programming

Linear Convergence Results for Inertial Type Projection Algorithm for Quasi-Variational Inequalities

A Fast Optimistic Method for Monotone Variational Inequalities

The Individual Convergence of Projected Subgradient Methods Using the Nesterov's Step-Size Strategy

Convergence Analysis of Projection Method for Variational Inequalities

Projection and contraction method with double inertial steps for quasi-monotone variational inequalities

Projection-type method with line-search process for solving variational inequalities

The Global R-linear Convergence of Nesterov's Accelerated Gradient Method with Unknown Strongly Convex Parameter

A Note on Nesterov's Accelerated Method in Nonconvex Optimization: a Weak Estimate Sequence Approach

Strong and linear convergence of projection-type method with an inertial term for finding minimum-norm solutions of pseudomonotone variational inequalities in Hilbert spaces

A class of modified projection algorithms for nonmonotone variational inequalities with continuity

On the convergence properties of non-Euclidean extragradient methods for variational inequalities with generalized monotone operators

An inertial Tseng's extragradient method for solving multi-valued variational inequalities with one projection