Block Coordinate Descent Methods for Structured Nonconvex Optimization with Nonseparable Constraints: Optimality Conditions and Global Convergence

Zhijie Yuan, Ganzhao Yuan, Lei Sun
2024-12-08
Abstract: Coordinate descent algorithms are widely used in machine learning and large-scale data analysis due to their strong optimality guarantees and impressive empirical performance in solving non-convex problems. In this work, we introduce Block Coordinate Descent (BCD) methods for structured nonconvex optimization with nonseparable constraints. Unlike traditional large-scale Coordinate Descent (CD) approaches, we do not assume the constraints are separable. Instead, we account for the possibility of nonlinear coupling among them. By leveraging the inherent problem structure, we propose new CD methods to tackle this specific challenge. Under the relatively mild condition of locally bounded non-convexity, we demonstrate that coordinate-wise stationary points offer a stronger optimality criterion than standard critical points. Furthermore, under the Luo-Tseng error bound conditions, our BCD methods exhibit Q-linear convergence to coordinate-wise stationary points or critical points. To demonstrate the practical utility of our methods, we apply them to various machine learning and signal processing models. We also provide a geometric analysis of these models. Experiments on real-world data consistently demonstrate the superior objective values of our approaches compared to existing methods.
Optimization and Control
What problem does this paper attempt to address?
This paper addresses two major challenges in structured optimization: non-convexity and non-separable constraints. Specifically, the authors introduce new Block Coordinate Descent (BCD) methods for structured non-convex optimization problems with non-separable constraints. Unlike traditional large-scale coordinate descent methods, these methods do not assume that the constraints are separable; they account for possible nonlinear coupling among the constraints.

### Core Problems of the Paper

1. **Non-convexity**: Non-convex optimization is central to training machine learning models, because non-convex formulations can capture complex prediction problems more accurately. However, such problems are NP-hard in general and notoriously difficult to solve.
2. **Non-separable constraints**: Traditional methods usually assume that the constraints are separable, which simplifies the solution process. In many practical applications the constraints are non-separable, and a different method is required.

### Specific Objectives

- Propose BCD algorithms for non-convex composite optimization problems with non-separable constraints that yield better solutions than existing methods.
- Establish the optimality properties of the proposed methods and show that every coordinate-wise stationary point is also a critical point.
- Study the convergence rate for this class of problems for the first time and establish a Q-linear convergence rate.
- Provide a geometric analysis and verify experimentally that the new methods perform better on real-world data.

### Main Contributions

1. **New BCD algorithms**: Two new BCD methods are proposed for non-convex optimization problems with non-separable constraints.
2. **Theoretical analysis**: The paper proves that every coordinate-wise stationary point is also a critical point and studies the convergence rate of such problems for the first time.
3. **Acceleration strategy**: A breakpoint search strategy and two semi-greedy index selection strategies are introduced to accelerate the BCD methods and improve computational efficiency.
4. **Geometric analysis**: A geometric analysis is provided for three application scenarios.
5. **Experimental verification**: Extensive experiments show that the new methods outperform existing full-gradient algorithms.

### Application Examples

The paper discusses four specific examples of optimization frameworks:

1. **Sparse Index Tracking (SIT)**: Used for asset selection and capital allocation.
2. **Non-negative Sparse PCA (NNSPCA)**: Extends traditional PCA by adding non-negativity and sparsity constraints.
3. **DC Penalized Binary Optimization (DCPB)**: Used to handle optimization problems with binary structure.
4. **Other variants of binary optimization problems**: Binary constraints are transformed into continuous-domain problems through different variational reformulations.

Through these examples, the paper demonstrates the wide applicability and effectiveness of the proposed BCD methods in practical applications.
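To make the role of a non-separable constraint concrete, the following is a minimal sketch on a toy problem that is not taken from the paper. It minimizes a possibly indefinite quadratic over the probability simplex, where the constraint `sum(x) == 1` couples all coordinates. Changing one coordinate alone would violate the constraint, so each step updates a pair `(i, j)` along the direction `e_i - e_j` and solves the resulting one-dimensional subproblem globally. The function names (`objective`, `pair_update`, `pairwise_bcd`), the quadratic objective, and the cyclic pair ordering are all assumptions made for illustration. This is not the authors' algorithm, which according to the summary also relies on a breakpoint search and semi-greedy index selection.

```python
import numpy as np


def objective(Q, c, x):
    """Toy objective f(x) = 0.5 * x^T Q x + c^T x (Q symmetric, may be indefinite)."""
    return 0.5 * x @ Q @ x + c @ x


def pair_update(Q, c, x, i, j):
    """Globally minimize f(x + t * (e_i - e_j)) over t in [-x[i], x[j]].

    Moving mass from coordinate j to coordinate i keeps sum(x) fixed, and the
    interval for t keeps both coordinates non-negative.
    """
    g = Q @ x + c                                 # gradient (valid for symmetric Q)
    slope = g[i] - g[j]                           # directional derivative at t = 0
    curv = Q[i, i] + Q[j, j] - 2.0 * Q[i, j]      # curvature along e_i - e_j
    lo, hi = -x[i], x[j]
    # A 1-D quadratic on an interval attains its minimum at an endpoint or,
    # when it is strictly convex, at its clipped stationary point.
    candidates = [lo, hi]
    if curv > 0:
        candidates.append(float(np.clip(-slope / curv, lo, hi)))
    t = min(candidates, key=lambda s: slope * s + 0.5 * curv * s * s)
    x = x.copy()
    x[i] += t
    x[j] -= t
    return x


def pairwise_bcd(Q, c, x0, sweeps=50):
    """Cycle over all coordinate pairs; returns the iterate and objective history."""
    x = x0.copy()
    n = len(x)
    history = [objective(Q, c, x)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                x = pair_update(Q, c, x, i, j)
        history.append(objective(Q, c, x))
    return x, history


# Hypothetical demo data: a random symmetric (generally indefinite) matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
Q = 0.5 * (M + M.T)
c = rng.standard_normal(6)
x, history = pairwise_bcd(Q, c, np.full(6, 1.0 / 6))
print(history[0], history[-1])
```

Because `t = 0` always lies in the feasible interval and each pair subproblem is solved exactly, the objective cannot increase from one step to the next (up to rounding), and every iterate stays on the simplex. The sketch only guarantees this monotone decrease; it does not reproduce the paper's optimality or Q-linear convergence results.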