Abstract: We consider the problem of minimizing a convex separable objective (a separable sum of two proper closed convex functions $f$ and $g$) subject to a linear coupling constraint. We assume that $f$ can be decomposed as the sum of a smooth part having Hölder continuous gradient (with exponent $\mu\in(0,1]$) and a nonsmooth part that admits efficient proximal mapping computations, while $g$ can be decomposed as the sum of a smooth part having Hölder continuous gradient (with exponent $\nu\in(0,1]$) and a nonsmooth part that admits efficient linear oracles. Motivated by the recent work [41], we propose a single-loop variant of the standard penalty method for this problem, which we call a single-loop proximal-conditional-gradient penalty method ($proxCG_{1\ell}^{pen}$). In each iteration, we successively perform one proximal-gradient step involving $f$ and one conditional-gradient step involving $g$ on the quadratic penalty function, followed by an update of the penalty parameter. We present explicit rules for updating the penalty parameter and the stepsize in the conditional-gradient step. Under a standard constraint qualification and a domain boundedness assumption, we show that the objective value deviations (from the optimal value) along the generated sequence decay on the order of $t^{-\min\{\mu,\nu,1/2\}}$, with the associated feasibility violations decaying on the order of $t^{-1/2}$. Moreover, if the nonsmooth parts are indicator functions and the extended objective is a Kurdyka-Łojasiewicz function with exponent $\alpha\in [0,1)$, then the distances to the optimal solution set along the sequence generated by $proxCG_{1\ell}^{pen}$ decay asymptotically at a rate of $t^{-(1-\alpha)\min\{\mu,\nu,1/2\}}$. Finally, we illustrate numerically the convergence behavior of $proxCG_{1\ell}^{pen}$ on minimizing the $\ell_1$ norm subject to a bound on the residual error measured in the $\ell_p$ norm, $p\in(1,2]$.
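The iteration described in the abstract (one proximal-gradient step in $f$, one conditional-gradient step in $g$, then a penalty update) can be sketched on a toy instance. The code below is a minimal illustration under stated assumptions, not the paper's $proxCG_{1\ell}^{pen}$. It takes $f(x)=\|x\|_1$, takes $g$ to be the indicator of the $\ell_2$ ball of radius $\sigma$, and uses the coupling constraint $Ax - y = b$ (the case $p=2$ of the numerical example). The penalty schedule $\beta_t=\beta_0\sqrt{t+1}$, the stepsize $\alpha_t = 2/(t+2)$, the Frobenius-norm bound on $\|A\|^2$, and all function names are hypothetical choices made for illustration; the abstract does not state the paper's actual update rules.

```python
import math

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1, applied componentwise."""
    return [math.copysign(max(abs(vi) - tau, 0.0), vi) for vi in v]

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def norm2(v):
    return math.sqrt(sum(vi * vi for vi in v))

def residual(A, x, y, b):
    """Constraint residual A x - y - b."""
    return [ax - yi - bi for ax, yi, bi in zip(matvec(A, x), y, b)]

def prox_cg_penalty_sketch(A, b, sigma, iters=500, beta0=1.0):
    """Single-loop penalty iteration on
    ||x||_1 + indicator(||y||_2 <= sigma) + (beta/2) * ||A x - y - b||^2."""
    m, n = len(A), len(A[0])
    x, y = [0.0] * n, [0.0] * m
    # Squared Frobenius norm: a simple upper bound on the squared spectral norm.
    L = sum(a * a for row in A for a in row)
    for t in range(iters):
        beta = beta0 * math.sqrt(t + 1)   # assumed penalty schedule
        # Proximal-gradient step in x on the quadratic penalty.
        step = 1.0 / (beta * L)
        grad_x = [beta * g for g in rmatvec(A, residual(A, x, y, b))]
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, grad_x)], step)
        # Conditional-gradient step in y: the penalty gradient in y is -beta * r,
        # so the linear oracle over the ball returns sigma * r / ||r||.
        r = residual(A, x, y, b)
        nr = norm2(r)
        u = [sigma * ri / nr for ri in r] if nr > 0 else list(y)
        alpha = 2.0 / (t + 2)             # assumed open-loop stepsize
        y = [yi + alpha * (ui - yi) for yi, ui in zip(y, u)]
    return x, y, norm2(residual(A, x, y, b))

A = [[2.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
b = [1.0, 2.0]
x, y, violation = prox_cg_penalty_sketch(A, b, sigma=0.1)
```

Here `violation` is the norm of $Ax - y - b$ at the final iterate. Because each `y` is a convex combination of points in the ball, it remains feasible for $g$ throughout, while feasibility of the coupling constraint is only approached as the penalty parameter grows.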

Splitting the Conditional Gradient Algorithm

A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming

Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity

Zero-Order Stochastic Conditional Gradient Sliding Method for Non-smooth Convex Optimization

Conditional Gradient Methods for Convex Optimization with General Affine and Nonlinear Constraints

Linear-memory and Decomposition-invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes

A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization

Model Function Based Conditional Gradient Method with Armijo-like Line Search

Second-order Conditional Gradient Sliding

A single-loop proximal-conditional-gradient penalty method

On a Frank-Wolfe approach for abs-smooth functions

An Accelerated Variance-Reduced Conditional Gradient Sliding Algorithm for First-order and Zeroth-order Optimization

Adaptive generalized conditional gradient method for multiobjective optimization

A Gradient Complexity Analysis for Minimizing the Sum of Strongly Convex Functions with Varying Condition Numbers

Non-Uniform Smoothness for Gradient Descent

Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization

Conditional Accelerated Lazy Stochastic Gradient Descent

Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

New Aspects of Black Box Conditional Gradient: Variance Reduction and One Point Feedback