Near-optimal tensor methods for minimizing the gradient norm of convex functions and accelerated primal–dual tensor methods
Pavel Dvurechensky, Petr Ostroukhov, Alexander Gasnikov, César A. Uribe, Anastasiya Ivanova
DOI: https://doi.org/10.1080/10556788.2023.2296443
2024-02-07
Optimization Methods and Software
Abstract: Motivated, in particular, by the entropy-regularized optimal transport problem, we consider convex optimization problems with linear equality constraints, where the dual objective has Lipschitz $p$-th order derivatives, and develop two approaches for solving such problems. The first approach is based on the minimization of the norm of the gradient in the dual problem and then the reconstruction of an approximate primal solution. Recently, Grapiglia and Nesterov [Tensor methods for finding approximate stationary points of convex functions, Optim. Methods Softw. (2020), pp. 1–34] showed lower complexity bounds for the problem of minimizing the gradient norm of a function with Lipschitz $p$-th order derivatives. Still, the question of optimal or near-optimal methods remained open, as the algorithms presented in [Grapiglia and Nesterov, Tensor methods for finding approximate stationary points of convex functions, Optim. Methods Softw. (2020), pp. 1–34] achieve suboptimal bounds only. We close this gap by proposing two near-optimal (up to logarithmic factors) methods with complexity bounds $\tilde{O}(\varepsilon^{-2(p+1)/(3p+1)})$ and $\tilde{O}(\varepsilon^{-2/(3p+1)})$ with respect to the initial objective residual and the distance between the starting point and the solution, respectively. We then apply these results (which are of independent interest) to our primal–dual setting. As the second approach, we propose a direct accelerated primal–dual tensor method for convex problems with linear equality constraints, where the dual objective has Lipschitz $p$-th order derivatives. For this algorithm, we prove $\tilde{O}(\varepsilon^{-1/(p+1)})$ complexity in terms of the duality gap and the residual in the constraints. We illustrate the practical performance of the proposed algorithms in experiments on logistic regression, the entropy-regularized optimal transport problem, and the minimal mutual information problem.
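To make the three complexity bounds quoted in the abstract easier to compare, the following minimal sketch evaluates the exponent of $1/\varepsilon$ in each bound for small values of $p$. It only does the arithmetic on the exponents stated above; logarithmic factors and constants hidden in $\tilde{O}$ are ignored, and the function name `complexity_exponents` and the dictionary keys are hypothetical labels chosen for illustration, not names from the paper.

```python
from fractions import Fraction


def complexity_exponents(p: int) -> dict:
    """Return the exponent a in the bound O~(eps^(-a)) for each method.

    The three formulas are taken from the abstract; the key names are
    illustrative labels (assumptions), not terminology from the paper.
    """
    if p < 1:
        raise ValueError("p must be a positive integer (order of the method)")
    return {
        # gradient-norm minimization, measured against the initial objective residual
        "grad_norm_vs_objective_residual": Fraction(2 * (p + 1), 3 * p + 1),
        # gradient-norm minimization, measured against the distance to a solution
        "grad_norm_vs_initial_distance": Fraction(2, 3 * p + 1),
        # direct accelerated primal-dual tensor method (duality gap and constraint residual)
        "primal_dual": Fraction(1, p + 1),
    }


if __name__ == "__main__":
    for p in (1, 2, 3):
        exps = complexity_exponents(p)
        print(p, {name: str(value) for name, value in exps.items()})
```

All three exponents decrease as $p$ grows, which reflects the abstract's point that higher-order (tensor) information reduces the dependence of the iteration count on $\varepsilon$.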
operations research & management science; mathematics, applied; computer science, software engineering