Convergence analysis of approximate primal solutions in dual first-order methods

Jie Lu, Mikael Johansson
DOI: https://doi.org/10.48550/arXiv.1502.06368
2015-02-23
Abstract: Dual first-order methods are powerful techniques for large-scale convex optimization. Although an extensive research effort has been devoted to studying their convergence properties, explicit convergence rates for the primal iterates have only been established under global Lipschitz continuity of the dual gradient. This is a rather restrictive assumption that does not hold for several important classes of problems. In this paper, we demonstrate that primal convergence rate guarantees can also be obtained when the dual gradient is only locally Lipschitz. The class of problems that we analyze admits general convex constraints including nonlinear inequality, linear equality, and set constraints. As an approximate primal solution, we take the minimizer of the Lagrangian, computed when evaluating the dual gradient. We derive error bounds for this approximate primal solution in terms of the errors of the dual variables, and establish convergence rates of the dual variables when the dual problem is solved using a projected gradient or fast gradient method. By combining these results, we show that the suboptimality and infeasibility of the approximate primal solution at iteration $k$ are no worse than $O(1/\sqrt{k})$ when the dual problem is solved using a projected gradient method, and $O(1/k)$ when a fast dual gradient method is used.
Optimization and Control
What problem does this paper attempt to address?
The paper addresses the convergence of approximate primal solutions when dual first-order methods are used to solve large-scale convex optimization problems. Specifically, it asks how to guarantee a convergence rate for the approximate primal solutions when the dual gradient is only locally Lipschitz continuous. Earlier work establishes explicit primal convergence rates only under global Lipschitz continuity of the dual gradient, which limits the scope of application. By relaxing this assumption, the paper extends the class of problems these methods can handle to general convex optimization problems with nonlinear inequality constraints, linear equality constraints, and set constraints.

### Background and Problem Description of the Paper

In large-scale optimization, Lagrangian duality is commonly used, especially when a few constraints complicate the problem. Although many first-order methods can be applied directly in the primal space, projecting onto the constraint set can be computationally expensive. In contrast, the dual problem has a more desirable structure: the dual constraint set has a simple form, and the (sub)gradient of the dual function is relatively easy to compute. Moreover, the dual function is usually additive and therefore suitable for distributed implementation. However, using dual optimization methods to generate optimal solutions for engineering problems raises several practical and theoretical challenges:

1. **Consistency between Dual and Primal Optimal Values**: The dual optimal value must equal the primal optimal value, that is, there must be no duality gap. For convex optimization problems, this can be ensured by verifying Slater's constraint qualification.
2. **Convergence of the Dual Iteration Sequence**: The iterates generated by the dual optimization method must converge to a dual optimal solution, which is not always the case.
3. **Construction of Approximate Primal Solutions**: Most applications require approximate primal solutions (which represent the actual decisions) to be constructed from the dual iterates. Whether these approximate primal solutions converge to the primal optimal solution is an important practical question.
4. **Estimation of the Convergence Rate**: To assess the solution time and understand how it depends on the problem data, the convergence rate of the solution must be estimated.

### Contributions of the Paper

The main contributions of this paper are as follows:

- **Relaxing the Assumption on the Dual Gradient**: The paper considers a broader class of convex optimization problems in which the dual gradient is only locally, rather than globally, Lipschitz continuous.
- **Error Analysis of Approximate Primal Solutions**: The paper establishes convergence of the approximate primal solutions by bounding their error in terms of the error of the dual variables.
- **Derivation of the Convergence Rate**: The paper derives convergence rates for the approximate primal solutions when the dual problem is solved with the classical projected gradient method or the fast gradient method. With the projected gradient method, the suboptimality and infeasibility of the approximate primal solution at iteration \(k\) are \(O(1 / \sqrt{k})\); with the fast gradient method, they are \(O(1 / k)\).

### Mathematical Expressions

- **Dual Function**: The dual function \(d(u)\) is defined as: \[ d(u)=\min_{x \in X} L(x, u) \] where \(L(x, u)=f(x)+\sum_{i = 1}^m u_i g_i(x)+(u_{m + 1:m + p})^T(Ax + b)\) is the Lagrangian function.
- **Dual Gradient**: The gradient of the dual function is: \[ \nabla d(u)=[g_1(\bar{x}(u)), \ldots, g_m(\bar{x}(u)),(A\bar{x}(u)+b)^T]^T \]
- **Approximate Primal Solution**: Given the dual variable \(u\), the approximate primal solution \(\bar{x}(u)\) is a minimizer of the Lagrangian function \(L(x, u)\): \[ \bar{x}(u) \in \arg \min_{x \in X} L(x, u) \]
- **Error Bound**: The paper derives error bounds for the approximate primal solution in terms of optimality and feasibility, for example: \[ \|\bar{x}(u)-x^\star\| \leq \gamma(u^\star)\|u - u^\star\| \]
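
The expressions above can be illustrated with a minimal Python sketch of a dual projected gradient method. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy problem (minimize \(\tfrac{1}{2}\|x - c\|^2\) subject to the single nonlinear inequality \(g(x)=\|x\|^2 - 1 \le 0\), with \(X=\mathbb{R}^2\) and no equality constraints), the vector `c`, the step size, the iteration count, and the function names. For this toy problem the Lagrangian minimizer has the closed form \(\bar{x}(u) = c/(1+2u)\), and the dual gradient is \(g(\bar{x}(u))\).

```python
import numpy as np

# Hypothetical toy problem (not from the paper):
#   minimize 0.5*||x - c||^2  subject to  g(x) = ||x||^2 - 1 <= 0,  X = R^2.
c = np.array([1.5, 2.0])


def g(x):
    """Nonlinear inequality constraint; feasible points satisfy g(x) <= 0."""
    return float(x @ x - 1.0)


def primal_from_dual(u):
    """Lagrangian minimizer x_bar(u) = argmin_x 0.5*||x - c||^2 + u*(||x||^2 - 1).

    Setting the gradient (x - c) + 2*u*x to zero gives x = c / (1 + 2u).
    """
    return c / (1.0 + 2.0 * u)


def dual_projected_gradient(u0=0.0, step=0.04, iters=500):
    """Projected gradient ascent on the dual function over u >= 0.

    The step 0.04 = 1 / (4*||c||^2) is an assumed choice: 4*||c||^2 bounds
    the slope of the dual gradient on u >= 0 for this toy problem.
    """
    u = u0
    for _ in range(iters):
        x = primal_from_dual(u)        # approximate primal solution at u
        u = max(0.0, u + step * g(x))  # dual gradient is g(x_bar(u)); project onto u >= 0
    return u, primal_from_dual(u)


u_k, x_k = dual_projected_gradient()

# Closed-form optimum of the toy problem, used only for comparison.
x_star = c / np.linalg.norm(c)
u_star = (np.linalg.norm(c) - 1.0) / 2.0

print("dual error:        ", abs(u_k - u_star))
print("primal error:      ", np.linalg.norm(x_k - x_star))
print("constraint value g:", g(x_k))
```

The primal iterate is obtained for free at each step, since \(\bar{x}(u)\) is already computed when evaluating the dual gradient. Printing the dual and primal errors side by side gives a rough feel for how the primal error tracks the dual error, which is the relationship the error bound describes; the sketch does not reproduce the paper's rates or constants.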