Abstract: The optimistic gradient method has seen increasing popularity for solving convex-concave saddle point problems. To analyze its iteration complexity, a recent work [arXiv:1906.01115, https://arxiv.org/abs/1906.01115] proposed an interesting perspective that interprets this method as an approximation to the proximal point method. In this paper, we follow this approach and distill the underlying idea of optimism to propose a generalized optimistic method, which includes the optimistic gradient method as a special case. Our general framework can handle constrained saddle point problems with composite objective functions and can work with arbitrary norms using Bregman distances. Moreover, we develop a backtracking line search scheme to select the step sizes without knowledge of the smoothness coefficients. We instantiate our method with first-, second- and higher-order oracles and give best-known global iteration complexity bounds. For our first-order method, we show that the averaged iterates converge at a rate of $O(1/N)$ when the objective function is convex-concave, and it achieves linear convergence when the objective is strongly-convex-strongly-concave. For our second- and higher-order methods, under the additional assumption that the distance-generating function has Lipschitz gradient, we prove a complexity bound of $O(1/\epsilon^{\frac{2}{p+1}})$ in the convex-concave setting and a complexity bound of $O((L_p D^{\frac{p-1}{2}}/\mu)^{\frac{2}{p+1}}+\log\log\frac{1}{\epsilon})$ in the strongly-convex-strongly-concave setting, where $L_p$ ($p\geq 2$) is the Lipschitz constant of the $p$-th-order derivative, $\mu$ is the strong convexity parameter, and $D$ is the initial Bregman distance to the saddle point. Moreover, our line search scheme provably only requires a constant number of calls to a subproblem solver per iteration on average, making our first- and second-order methods particularly amenable to implementation.
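
To make the "approximation to the proximal point method" reading concrete, the two updates can be written side by side in the simplest setting. This is a sketch under assumptions not stated in the abstract: the problem is unconstrained and Euclidean, there are no composite terms, the step size $\eta > 0$ is constant, and the notation $z = (x, y)$, $F(z) = (\nabla_x f(x, y), -\nabla_y f(x, y))$ is introduced here for illustration. The paper's general framework (Bregman distances, line search, higher-order oracles) is broader than this special case.

\[ \text{proximal point:}\quad z_{k+1} = z_k - \eta F(z_{k+1}), \qquad \text{optimistic gradient:}\quad z_{k+1} = z_k - \eta F(z_k) - \eta \bigl( F(z_k) - F(z_{k-1}) \bigr). \]

The proximal point step is implicit because $F$ is evaluated at the unknown $z_{k+1}$. The optimistic step replaces that unknown value with the prediction $F(z_k) + (F(z_k) - F(z_{k-1}))$, which needs only quantities that are already available.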

What problem does this paper attempt to address?

This paper addresses convex-concave saddle point problems, also known as minimax optimization problems. Specifically, it focuses on saddle point problems with composite objective functions:

\[ \min_{x \in X} \max_{y \in Y} \ell(x, y) := f(x, y) + h_1(x) - h_2(y) \]

where $X \subset \mathbb{R}^m$ and $Y \subset \mathbb{R}^n$ are non-empty closed convex sets, $h_1: \mathbb{R}^m \to (-\infty, +\infty]$ and $h_2: \mathbb{R}^n \to (-\infty, +\infty]$ are proper closed convex functions, and $f$ is a smooth function defined on an open set containing $X \times Y$. Additionally, $f$ is assumed to be convex in $x$ and concave in $y$.

The main contributions of the paper include:

1. **Proposing a generalized optimistic method**, which views the optimistic gradient method as an approximation of the proximal point method and extends this idea to handle constrained saddle point problems with composite objective functions.
2. **Developing a backtracking line search scheme** that selects step sizes without knowledge of the smoothness coefficients.
3. **Providing optimistic methods based on different order information** (first-order, second-order, and higher-order optimistic methods) and presenting the best-known global iteration complexity bounds for these methods in both the convex-concave and the strongly-convex-strongly-concave settings.

With these methods, the paper covers not only unconstrained smooth saddle point problems but also constrained problems with composite terms, and its theoretical results match the best existing upper bounds.
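
As a rough illustration of the first-order optimistic idea, the sketch below runs the classical optimistic gradient update on the toy bilinear problem $f(x, y) = xy$ with $X = Y = \mathbb{R}$ and $h_1 = h_2 = 0$, whose unique saddle point is the origin. It is not the paper's generalized method: there is no Bregman distance, no constraint handling, and no backtracking line search. The function names, the constant step size `eta = 0.25`, and the starting point are all assumptions made for this example. Plain gradient descent-ascent is included only for contrast.

```python
import math


def operator(x, y):
    """F(x, y) = (df/dx, -df/dy) for the toy objective f(x, y) = x * y."""
    return (y, -x)


def optimistic_gradient(x0, y0, eta, num_iters):
    """Classical optimistic gradient update with a constant step size (assumed)."""
    x, y = x0, y0
    prev = operator(x, y)  # convention: the first step is a plain gradient step
    for _ in range(num_iters):
        curr = operator(x, y)
        # z_{k+1} = z_k - eta * F(z_k) - eta * (F(z_k) - F(z_{k-1}))
        x -= eta * (2.0 * curr[0] - prev[0])
        y -= eta * (2.0 * curr[1] - prev[1])
        prev = curr
    return x, y


def gradient_descent_ascent(x0, y0, eta, num_iters):
    """Plain simultaneous gradient descent-ascent, shown only for comparison."""
    x, y = x0, y0
    for _ in range(num_iters):
        gx, gy = operator(x, y)
        x, y = x - eta * gx, y - eta * gy
    return x, y


# Distance to the saddle point (0, 0) after the same number of iterations.
ogda_distance = math.hypot(*optimistic_gradient(1.0, 1.0, eta=0.25, num_iters=1000))
gda_distance = math.hypot(*gradient_descent_ascent(1.0, 1.0, eta=0.25, num_iters=1000))
print("optimistic gradient:", ogda_distance)
print("gradient descent-ascent:", gda_distance)
```

On this bilinear example the optimistic iterates contract toward the saddle point, while plain gradient descent-ascent spirals outward. The extra correction term $F(z_k) - F(z_{k-1})$ is what makes the difference.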

Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
