Variance-reduced Reshuffling Gradient Descent for Nonconvex Optimization: Centralized and Distributed Algorithms

Xia Jiang, Xianlin Zeng, Lihua Xie, Jian Sun, Jie Chen
DOI: https://doi.org/10.1016/j.automatica.2024.111954
2025-01-01
Abstract: Nonconvex finite-sum optimization plays a crucial role in signal processing and machine learning, fueling the development of numerous centralized and distributed stochastic algorithms. However, existing stochastic optimization algorithms often suffer from high stochastic gradient variance due to the use of random sampling with replacement. To address this issue, this paper introduces an explicit variance-reduction step and proposes variance-reduced reshuffling gradient algorithms with a sampling-without-replacement scheme. Specifically, this paper proves that the proposed centralized variance-reduced reshuffling gradient algorithm (VR-RG) with constant step sizes converges to a stationary point for nonconvex optimization under the Kurdyka-Łojasiewicz condition. Moreover, for nonconvex optimization over connected multi-agent networks, the proposed distributed variance-reduced reshuffling gradient algorithm (DVR-RG) converges to a neighborhood of stationary points, where the neighborhood can be made arbitrarily small under mild conditions. Notably, the proposed DVR-RG requires only one communication round at each epoch. Finally, numerical simulations demonstrate the efficiency of the proposed algorithms. (c) 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
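To illustrate the general idea of combining sampling without replacement with an explicit variance-reduction step, the following is a minimal sketch and not the paper's VR-RG algorithm, whose exact update rule is not given in the abstract. It assumes an SVRG-style correction (a full gradient computed at a per-epoch snapshot) applied while visiting each component exactly once per epoch in a freshly shuffled order. The function name `vr_reshuffling_gd`, the helper `ls_grad`, the step size, and the toy least-squares problem are all hypothetical choices made so that the result is easy to check; the toy problem is convex, whereas the paper targets nonconvex objectives.

```python
import numpy as np


def vr_reshuffling_gd(grad_i, n, x0, step, epochs, seed=0):
    """Illustrative variance-reduced reshuffling loop (assumed SVRG-style correction).

    grad_i(x, i) returns the gradient of the i-th component function at x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        # Snapshot and its full gradient, recomputed once per epoch.
        snapshot = x.copy()
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        # Sampling without replacement: one pass over a random permutation.
        for i in rng.permutation(n):
            # Corrected gradient: component gradient minus its snapshot value,
            # plus the full gradient at the snapshot.
            g = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - step * g
    return x


# Toy finite-sum problem (an assumption for demonstration only):
# f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 with unit-norm rows a_i.
data_rng = np.random.default_rng(1)
A = data_rng.normal(size=(20, 3))
A /= np.linalg.norm(A, axis=1, keepdims=True)
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true


def ls_grad(x, i):
    # Gradient of the i-th least-squares component.
    return A[i] * (A[i] @ x - b[i])


x_hat = vr_reshuffling_gd(ls_grad, n=20, x0=np.zeros(3), step=0.1, epochs=200)
```

In this sketch the constant step size mirrors the abstract's mention of constant step sizes, but the value 0.1 is tuned only for the toy problem and carries no guarantee for other objectives.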