Breaking the $O(1/\epsilon)$ Optimal Rate for a Class of Minimax Problems

Song Chaobing, Jiang Yong, Ma Yi
DOI: https://doi.org/10.48550/arxiv.2003.11758
2020-01-01
Abstract: It is known that for convex optimization $\min_{\mathbf{w}\in\mathcal{W}}f(\mathbf{w})$, the best possible rate of first-order accelerated methods is $O(1/\sqrt{\epsilon})$. However, for the bilinear minimax problem $\min_{\mathbf{w}\in\mathcal{W}}\max_{\mathbf{v}\in\mathcal{V}} f(\mathbf{w})+\langle\mathbf{w}, \boldsymbol{A}\mathbf{v}\rangle-h(\mathbf{v})$, where both $f(\mathbf{w})$ and $h(\mathbf{v})$ are convex, the best known rate of first-order methods slows down to $O(1/\epsilon)$. It is not known whether one can achieve the accelerated rate $O(1/\sqrt{\epsilon})$ for the bilinear minimax problem without assuming that $f(\mathbf{w})$ and $h(\mathbf{v})$ are strongly convex. In this paper, we fill this theoretical gap by proposing a bilinear accelerated extragradient (BAXG) method. We show that when $\mathcal{W}=\mathbb{R}^d$, $f(\mathbf{w})$ and $h(\mathbf{v})$ are convex and smooth, and $\boldsymbol{A}$ has full column rank, the BAXG method achieves an accelerated rate $O(1/\sqrt{\epsilon}\log \frac{1}{\epsilon})$, within a logarithmic factor of the likely optimal rate $O(1/\sqrt{\epsilon})$. As a result, a large class of bilinear convex-concave minimax problems, including a few problems of practical importance, can be solved much faster than with previously known methods.
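To make the problem class concrete, the following is a minimal sketch of the classical (non-accelerated) extragradient iteration applied to $\min_{\mathbf{w}}\max_{\mathbf{v}} f(\mathbf{w})+\langle\mathbf{w}, \boldsymbol{A}\mathbf{v}\rangle-h(\mathbf{v})$. It is not the BAXG method, whose update rules the abstract does not state. The function name `extragradient`, the step size, the choice $f = h = 0$ (convex and smooth, but not strongly convex), the unconstrained $\mathcal{V}=\mathbb{R}^n$, and the matrix `A` are all illustrative assumptions.

```python
import numpy as np

def extragradient(grad_f, grad_h, A, w0, v0, step, iters):
    """Classical extragradient for min_w max_v f(w) + <w, A v> - h(v).

    Assumes W = R^d and V = R^n (no projections). This is a baseline
    sketch, not the BAXG method from the paper.
    """
    w, v = w0.astype(float), v0.astype(float)
    for _ in range(iters):
        # Extrapolation step: descend in w, ascend in v.
        w_half = w - step * (grad_f(w) + A @ v)
        v_half = v + step * (A.T @ w - grad_h(v))
        # Update step: start from (w, v) but use gradients at the extrapolated point.
        w_new = w - step * (grad_f(w_half) + A @ v_half)
        v_new = v + step * (A.T @ w_half - grad_h(v_half))
        w, v = w_new, v_new
    return w, v

# Hypothetical instance: f = h = 0, so the objective is purely bilinear and
# neither term is strongly convex. A has full column rank, so the unique
# saddle point is (w, v) = (0, 0).
A = np.array([[2.0, 0.0], [0.0, 1.0]])
zero_grad = lambda x: np.zeros_like(x)

w, v = extragradient(zero_grad, zero_grad, A,
                     w0=np.array([1.0, -1.0]), v0=np.array([0.5, 2.0]),
                     step=0.3, iters=500)
print(np.linalg.norm(w), np.linalg.norm(v))
```

The step size 0.3 is chosen so that `step` times the largest singular value of `A` stays below 1, which is what keeps this particular bilinear instance contracting toward the saddle point; other instances would need a different value.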