Abstract: This paper studies first-order algorithms for solving fully composite optimization problems over convex and compact sets. We leverage the structure of the objective by handling its differentiable and non-differentiable components separately, linearizing only the smooth parts. This provides us with new generalizations of the classical Frank-Wolfe method and the Conditional Gradient Sliding algorithm, that cater to a subclass of non-differentiable problems. Our algorithms rely on a stronger version of the linear minimization oracle, which can be efficiently implemented in several practical applications. We provide the basic version of our method with an affine-invariant analysis and prove global convergence rates for both convex and non-convex objectives. Furthermore, in the convex case, we propose an accelerated method with correspondingly improved complexity. Finally, we provide illustrative experiments to support our theoretical results.

What problem does this paper attempt to address?

This paper addresses a class of fully composite optimization problems of the form

\[ \min_{x \in X} \phi(x) \triangleq F(f(x), x) \]

where \(X\) is a convex and compact set, \(F: \mathbb{R}^n\times X\rightarrow\mathbb{R}\) is a simple convex function that may be non-smooth, and \(f: X\rightarrow\mathbb{R}^n\) is a smooth mapping that accounts for most of the computational cost. Problems of this kind arise widely in applications and generalize many classical composite optimization settings.

The paper treats the smooth and non-smooth parts of the objective separately and linearizes only the smooth part. This yields new generalizations of the classical Frank-Wolfe method and the Conditional Gradient Sliding algorithm that apply to a class of non-smooth problems. The methods rely on a stronger version of the linear minimization oracle, which can be implemented efficiently in several practical applications.

Specifically, the main contributions of the paper are:

- A basic method for the problem above, with an affine-invariant analysis and global convergence rates of \(O(1/k)\) for convex objectives and \(\tilde{O}(1/\sqrt{k})\) for non-convex objectives.
- An accelerated method for the convex case, with a convergence rate of \(O(1/k^2)\) and the optimal \(O(\epsilon^{-1/2})\) oracle complexity, measured in the number of computations of \(\nabla f\).
- Numerical experiments that illustrate the proposed methods.

Overall, the paper proposes an algorithmic framework for structured non-smooth and non-convex optimization problems, in particular those whose subproblems become tractable once the smooth part is linearized. The methods come with theoretical convergence guarantees, and the experiments are presented as illustrations supporting that theory.
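The idea of linearizing only the smooth map \(f\) while keeping \(F\) exact can be illustrated on a toy problem. The sketch below is not the paper's algorithm as stated. It assumes a one-dimensional set \(X = [-2, 2]\), the outer function \(F(u, x) = \max(u_1, u_2)\), the inner map \(f(x) = ((x-1)^2, (x+1)^2)\), and a Frank-Wolfe style averaging step with the classical step size \(2/(k+2)\). All function names (`oracle`, `fully_composite_fw`, and so on) are hypothetical and chosen for illustration only.

```python
"""Toy sketch: linearize only the smooth map f, keep the outer max exact."""


def f(x):
    # Smooth inner map f: [lo, hi] -> R^2 (toy choice, not taken from the paper).
    return ((x - 1.0) ** 2, (x + 1.0) ** 2)


def f_prime(x):
    # Derivative of each component of f.
    return (2.0 * (x - 1.0), 2.0 * (x + 1.0))


def phi(x):
    # Objective phi(x) = F(f(x), x) with the assumed F(u, x) = max(u_1, u_2).
    return max(f(x))


def oracle(y, lo, hi):
    """Minimize max_i [f_i(y) + f_i'(y) * (x - y)] over x in [lo, hi].

    In 1D, a max of two affine functions attains its minimum at an endpoint
    or where the two lines cross, so it suffices to compare those candidates.
    """
    a, g = f(y), f_prime(y)

    def model(x):
        # Outer max applied to the linearized inner map.
        return max(a[i] + g[i] * (x - y) for i in range(2))

    candidates = [lo, hi]
    if g[0] != g[1]:
        cross = y + (a[1] - a[0]) / (g[0] - g[1])
        if lo <= cross <= hi:
            candidates.append(cross)
    return min(candidates, key=model)


def fully_composite_fw(y0, num_iters, lo=-2.0, hi=2.0):
    # Frank-Wolfe style loop: solve the partially linearized subproblem,
    # then average with the current point.
    y = y0
    for k in range(num_iters):
        x = oracle(y, lo, hi)
        gamma = 2.0 / (k + 2)  # assumed classical Frank-Wolfe step size
        y = (1.0 - gamma) * y + gamma * x
    return y


if __name__ == "__main__":
    y = fully_composite_fw(2.0, 200)
    print(y, phi(y))
```

In this toy instance the objective \(\max((x-1)^2, (x+1)^2)\) is non-differentiable at its minimizer \(x = 0\), where it takes the value 1. The subproblem solved by `oracle` is harder than a plain linear minimization, since it minimizes a max of affine functions, which is the sense in which a stronger oracle is needed here.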

Linearization Algorithms for Fully Composite Optimization

Linearized Proximal Algorithms with Adaptive Stepsizes for Convex Composite Optimization with Applications

A Unified Algorithmic Framework for Distributed Composite Optimization

Distributed Algorithms for Composite Optimization: Unified Framework and Convergence Analysis

On Convergence Rates of Linearized Proximal Algorithms for Convex Composite Optimization with Applications

A Unified Contraction Analysis of a Class of Distributed Algorithms for Composite Optimization

Complementary Composite Minimization, Small Gradients in General Norms, and Applications

Linear convergence of first order methods for non-strongly convex optimization

Gradient sliding for composite optimization

Nonsmooth, Nonconvex Optimization Using Functional Encoding and Component Transition Information

Stochastic subgradient for composite optimization with functional constraints

Accelerated First-Order Optimization under Nonlinear Constraints

A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization

A Smoothing Stochastic Gradient Method for Composite Optimization

Linear-memory and Decomposition-invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes

On Linear Convergence in Smooth Convex-Concave Bilinearly-Coupled Saddle-Point Optimization: Lower Bounds and Optimal Algorithms

First-order Methods for Affinely Constrained Composite Non-convex Non-smooth Problems: Lower Complexity Bound and Near-optimal Methods

Formalization of Complexity Analysis of the First-order Algorithms for Convex Optimization

A Nonlinear Bregman Primal-Dual Framework for Optimizing Nonconvex Infimal Convolutions

First Order Methods beyond Convexity and Lipschitz Gradient Continuity with Applications to Quadratic Inverse Problems

A Smooth Primal-Dual Optimization Framework for Nonsmooth Composite Convex Minimization