Decentralized Sum-of-Nonconvex Optimization

Zhuanghua Liu, Bryan Kian Hsiang Low
2024-02-04
Abstract: We consider the optimization problem of minimizing the sum-of-nonconvex function, i.e., a convex function that is the average of nonconvex components. The existing stochastic algorithms for such a problem only focus on a single machine and the centralized scenario. In this paper, we study the sum-of-nonconvex optimization in the decentralized setting. We present a new theoretical analysis of the PMGT-SVRG algorithm for this problem and prove the linear convergence of their approach. However, the convergence rate of the PMGT-SVRG algorithm has a linear dependency on the condition number, which is undesirable for the ill-conditioned problem. To remedy this issue, we propose an accelerated stochastic decentralized first-order algorithm by incorporating the techniques of acceleration, gradient tracking, and multi-consensus mixing into the SVRG algorithm. The convergence rate of the proposed method has a square-root dependency on the condition number. The numerical experiments validate the theoretical guarantee of our proposed algorithms on both synthetic and real-world datasets.
Optimization and Control, Machine Learning
What problem does this paper attempt to address?
This paper addresses the minimization of a sum-of-nonconvex function, i.e., a convex function that is the average of nonconvex components, in a decentralized setting. Existing stochastic algorithms for this problem cover only the single-machine and centralized scenarios. The authors first give a new theoretical analysis of the PMGT-SVRG algorithm for this problem and prove that it converges linearly. However, its convergence rate depends linearly on the condition number, which is undesirable for ill-conditioned problems. To remedy this, the authors propose an accelerated stochastic decentralized first-order algorithm that incorporates acceleration, gradient tracking, and multi-consensus mixing into the SVRG algorithm. This reduces the dependency of the convergence rate on the condition number from linear to square-root. Numerical experiments on both synthetic and real-world datasets validate the theoretical guarantees. Overall, the paper aims to design a decentralized algorithm for sum-of-nonconvex optimization that is efficient in both communication and computation.
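To make the setting concrete, the following is a minimal sketch and not the paper's algorithm. It builds a toy sum-of-nonconvex problem in which every agent holds an indefinite quadratic while their average is strongly convex, then runs plain gradient tracking with several mixing rounds per iteration on a four-node ring. The quadratic construction, the ring mixing matrix, the step size, and all function names are assumptions chosen for illustration. The sketch uses full local gradients and plain repeated mixing, so it omits the SVRG variance reduction and the acceleration described above, and its mixing routine may differ from the one used by PMGT-SVRG.

```python
import numpy as np

def make_problem(m=4, d=3, c=3.0, seed=0):
    """Agent i holds f_i(x) = 0.5 x^T A_i x - b^T x (hypothetical toy problem).

    Each A_i = I + c * s_i * D is indefinite, but the signs s_i cancel,
    so the average Hessian is the identity and the average is convex.
    """
    assert m % 2 == 0 and d >= 2, "signs must cancel; D needs two diagonal entries"
    rng = np.random.default_rng(seed)
    D = np.zeros((d, d))
    D[0, 0], D[1, 1] = 1.0, -1.0
    signs = np.array([1.0 if i % 2 == 0 else -1.0 for i in range(m)])
    A = np.stack([np.eye(d) + c * s * D for s in signs])
    b = rng.standard_normal(d)
    return A, b

def ring_mixing_matrix(m):
    """Symmetric, doubly stochastic mixing matrix for a ring of m >= 3 agents."""
    P = np.roll(np.eye(m), 1, axis=1)  # cyclic shift
    return 0.5 * np.eye(m) + 0.25 * (P + P.T)

def local_grads(A, b, X):
    """Row i is the gradient of f_i evaluated at agent i's own iterate X[i]."""
    return np.einsum("ijk,ik->ij", A, X) - b

def gradient_tracking(A, b, W, K=5, eta=0.5, iters=100):
    """Gradient tracking with K plain mixing rounds per iteration."""
    m, d = A.shape[0], A.shape[1]
    WK = np.linalg.matrix_power(W, K)  # K rounds of neighbor averaging
    X = np.zeros((m, d))               # one iterate per agent (rows)
    G = local_grads(A, b, X)
    S = G.copy()                       # tracker of the average gradient
    for _ in range(iters):
        X_new = WK @ (X - eta * S)
        G_new = local_grads(A, b, X_new)
        S = WK @ (S + G_new - G)
        X, G = X_new, G_new
    return X

A, b = make_problem()
X = gradient_tracking(A, b, ring_mixing_matrix(A.shape[0]))
# The average objective is 0.5 * ||x||^2 - b^T x, whose minimizer is x = b.
print("smallest eigenvalue of A_0:", np.linalg.eigvalsh(A[0]).min())
print("max deviation from minimizer:", np.abs(X - b).max())
```

In this toy instance each local Hessian has a negative eigenvalue, so no agent could minimize its own objective in isolation. Only the tracked average gradient points toward the shared minimizer, which is why the consensus and tracking steps matter in this setting.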