A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions

Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford
2023-11-18
Abstract: We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain. When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution using $\widetilde{O}(n \epsilon^{-1/3} + \epsilon^{-2})$ gradient and function evaluations, and $\widetilde{O}(n \epsilon^{-4/3})$ additional runtime. For large $n$, our evaluation complexity is optimal up to polylogarithmic factors. In the special case where each $f_i$ is linear -- which corresponds to finding a near-optimal primal strategy in a matrix game -- our method finds an $\epsilon$-approximate solution in runtime $\widetilde{O}(n (d/\epsilon)^{2/3} + nd + d\epsilon^{-2})$. For $n>d$ and $\epsilon=1/\sqrt{n}$ this improves over all existing first-order methods. When additionally $d = \omega(n^{8/11})$ our runtime also improves over all known interior point methods. Our algorithm combines three novel primitives: (1) A dynamic data structure which enables efficient stochastic gradient estimation in small $\ell_2$ or $\ell_1$ balls. (2) A mirror descent algorithm tailored to our data structure implementing an oracle which minimizes the objective over these balls. (3) A simple ball oracle acceleration framework suitable for non-Euclidean geometry.
Data Structures and Algorithms, Machine Learning, Optimization and Control
What problem does this paper attempt to address?
The paper addresses the problem of minimizing the maximum of a finite set of functions, formally \(\min_{x \in X} \max_{i \in [n]} f_i(x)\), where each \(f_i\) is convex and \(X\) is a closed convex set. It focuses on two settings:

1. **Euclidean or Simplex Domain**: \(X\) is a \(d\)-dimensional domain, either a subset of a Euclidean ball (measured in the \(\ell_2\) norm) or the probability simplex (measured in the \(\ell_1\) norm).
2. **Linear Case**: When each \(f_i(x) = a_i^{\top}x\), the problem amounts to finding a near-optimal primal strategy for one player in a matrix game, which is also closely connected to linear programming.

The goal is to design efficient algorithms for these settings and to analyze their complexity. The paper proposes a new primal accelerated method that finds an \(\epsilon\)-approximate solution with fewer gradient and function evaluations. When each \(f_i\) is \(1\)-Lipschitz and \(1\)-smooth, the method uses \(\widetilde{O}(n \epsilon^{-1/3} + \epsilon^{-2})\) gradient and function evaluations and \(\widetilde{O}(n \epsilon^{-4/3})\) additional runtime. In the linear case, it runs in time \(\widetilde{O}(n (d/\epsilon)^{2/3} + nd + d\epsilon^{-2})\).

To achieve this, the authors develop three novel technical components:

1. **Dynamic Data Structure**: Enables efficient stochastic gradient estimation within small \(\ell_2\) or \(\ell_1\) balls.
2. **Mirror Descent Algorithm**: Tailored to this data structure, it implements an oracle that minimizes the objective over these balls.
3. **Simple Ball Oracle Acceleration Framework**: Accelerates the ball oracle and is suitable for non-Euclidean geometry.

Combining these components, the method achieves evaluation complexity that is optimal up to polylogarithmic factors for large \(n\). In the linear case with \(n > d\) and \(\epsilon = 1/\sqrt{n}\), its runtime improves over all existing first-order methods, and when additionally \(d = \omega(n^{8/11})\) it also improves over all known interior point methods.
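To make the parameter regime quoted in the abstract concrete, the short calculation below substitutes \(\epsilon = 1/\sqrt{n}\) into the stated matrix-game runtime \(\widetilde{O}(n (d/\epsilon)^{2/3} + nd + d\epsilon^{-2})\). It is plain arithmetic on the bound as written in the abstract, not a derivation taken from the paper.

```latex
\[
\begin{aligned}
n\left(\frac{d}{\epsilon}\right)^{2/3} &= n\, d^{2/3}\, n^{1/3} = n^{4/3} d^{2/3}, \\
d\,\epsilon^{-2} &= nd, \\
n\left(\frac{d}{\epsilon}\right)^{2/3} + nd + d\,\epsilon^{-2} &= n^{4/3} d^{2/3} + 2nd .
\end{aligned}
\]
```

Since \(n^{4/3} d^{2/3} \ge nd\) exactly when \(n \ge d\), the first term dominates for \(n > d\), and the bound simplifies to \(\widetilde{O}(n^{4/3} d^{2/3})\) in this regime.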
Overall, the algorithms and techniques proposed in this paper provide a new approach to this class of optimization problems, particularly when the number of functions \(n\) and the dimension \(d\) are large.
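As a concrete illustration of the problem class in the linear case, the sketch below evaluates the objective \(\max_{i} a_i^{\top}x\) and approximately minimizes it over the probability simplex. It uses textbook entropic mirror descent with subgradients as a simple baseline and is not the paper's accelerated method: it does not use the data structure or the ball oracle. The function names `max_objective` and `entropic_mirror_descent`, the step size, and the example matrix are illustrative assumptions, and the step size assumes the entries of `A` lie in \([-1, 1]\).

```python
import numpy as np


def max_objective(A, x):
    """Return max_i a_i^T x, where a_i is the i-th row of A."""
    return float(np.max(A @ x))


def entropic_mirror_descent(A, steps=2000):
    """Baseline (NOT the paper's method): subgradient mirror descent with the
    entropy mirror map for min over the simplex of max_i a_i^T x.

    Assumes the entries of A lie in [-1, 1], which motivates the step size.
    Returns the average of the iterates.
    """
    n, d = A.shape
    step_size = np.sqrt(2.0 * np.log(d) / steps)
    x = np.full(d, 1.0 / d)      # start at the uniform distribution
    avg = np.zeros(d)
    for _ in range(steps):
        avg += x                 # accumulate the current iterate
        i_star = int(np.argmax(A @ x))
        g = A[i_star]            # a subgradient of the max at x
        x = x * np.exp(-step_size * g)   # multiplicative-weights update
        x /= x.sum()             # renormalize onto the simplex
    return avg / steps


# Hypothetical 2x2 game: min over the simplex of max(0.5 * x1, x2).
# The two terms balance at x = (2/3, 1/3), giving game value 1/3.
A = np.array([[0.5, 0.0], [0.0, 1.0]])
x_hat = entropic_mirror_descent(A)
print(x_hat, max_objective(A, x_hat))
```

Each iteration of this baseline costs a full matrix-vector product, that is, \(O(nd)\) time, and its iteration count scales as \(\epsilon^{-2}\); it is included only to make the objective concrete.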