Abstract: Distributed stochastic non-convex optimization problems have recently received attention due to the growing interest of the signal processing, computer vision, and natural language processing communities in applications deployed over distributed learning systems (e.g., federated learning). We study the setting where the data is distributed across the nodes of a time-varying directed network, a topology suitable for modeling dynamic networks experiencing communication delays and straggler effects. The network nodes, which can access only their local objectives and query a stochastic first-order oracle to obtain gradient estimates, collaborate to minimize a global objective function by exchanging messages with their neighbors. We propose an algorithm, novel to this setting, that leverages stochastic gradient descent with momentum and gradient tracking to solve distributed non-convex optimization problems over time-varying networks. To analyze the algorithm, we tackle the challenges that arise when analyzing dynamic network systems that communicate gradient acceleration components. We prove that the algorithm's oracle complexity is $\mathcal{O}(1/\epsilon^{1.5})$, and that under the Polyak-Łojasiewicz (PL) condition the algorithm converges linearly to a steady error state. The proposed scheme is tested on several learning tasks: a non-convex logistic regression experiment on the MNIST dataset, an image classification task on the CIFAR-10 dataset, and an NLP classification test on the IMDB dataset. We further present numerical simulations with an objective that satisfies the PL condition. The results demonstrate superior performance of the proposed framework compared to existing related methods.
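To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It combines heavy-ball momentum with gradient tracking on a toy problem in which each node holds a scalar quadratic (so the objective here is convex, unlike the paper's setting). Everything in it is an assumption made for illustration: the row-stochastic/column-stochastic weight pair, the random graph model, the placement of the momentum term, and the names `random_directed_weights`, `matvec`, and `run`.

```python
import random


def random_directed_weights(n, rng):
    """Draw one round's directed graph (with self-loops) and return (A, B):
    A is row-stochastic, B is column-stochastic, both on the same edges."""
    # adj[i][j] is True when node i receives a message from node j.
    adj = [[i == j or rng.random() < 0.5 for j in range(n)] for i in range(n)]
    A = [[adj[i][j] / sum(adj[i]) for j in range(n)] for i in range(n)]
    col = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    B = [[adj[i][j] / col[j] for j in range(n)] for i in range(n)]
    return A, B


def matvec(M, v):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def run(n=5, steps=200, alpha=0.02, beta=0.5, noise=0.1, seed=0):
    rng = random.Random(seed)
    c = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # toy local minimizers

    def grad(i, xi):
        # Stochastic first-order oracle for f_i(x) = 0.5 * (x - c_i)^2.
        return (xi - c[i]) + rng.gauss(0.0, noise)

    x = [0.0] * n                          # local iterates
    g = [grad(i, x[i]) for i in range(n)]  # latest local gradient estimates
    y = g[:]                               # gradient trackers, initialised to g
    m = [0.0] * n                          # momentum buffers
    worst_gap = 0.0
    for _ in range(steps):
        A, B = random_directed_weights(n, rng)  # topology changes every round
        # Heavy-ball momentum driven by the tracked gradient (assumed placement).
        m = [beta * mi + yi for mi, yi in zip(m, y)]
        # Mix iterates with the row-stochastic weights, then take a step.
        x = [xi - alpha * mi for xi, mi in zip(matvec(A, x), m)]
        g_new = [grad(i, x[i]) for i in range(n)]
        # Tracking update: mix trackers, then add the change in local gradients.
        y = [by + gn - go for by, gn, go in zip(matvec(B, y), g_new, g)]
        g = g_new
        # Column-stochastic mixing keeps sum(y) equal to sum(g) at every round.
        worst_gap = max(worst_gap, abs(sum(y) - sum(g)))
    return x, c, worst_gap
```

Calling `run()` returns the final local iterates, the toy local minimizers, and the largest observed deviation between the sum of the trackers and the sum of the current gradient estimates. That deviation should stay at floating-point round-off level, because mixing with column-stochastic weights preserves the sum of the trackers regardless of how the graph changes from round to round.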

Two-timescale projection neural networks in collaborative neurodynamic approaches to global optimization and distributed optimization

A collaborative neurodynamic approach with two-timescale projection neural networks designed via majorization-minimization for global optimization and distributed global optimization

Two-timescale recurrent neural networks for distributed minimax optimization

Accelerated Primal-Dual Projection Neurodynamic Approach With Time Scaling for Linear and Set Constrained Convex Optimization Problems

A Two-Timescale Duplex Neurodynamic Approach to Mixed-Integer Optimization

Neurodynamic approaches for multi-agent distributed optimization

Cardinality-constrained portfolio selection via two-timescale duplex neurodynamic optimization

A Novel Swarm-Exploring Neurodynamic Network for Obtaining Global Optimal Solutions to Nonconvex Nonlinear Programming Problems

A discrete-time neural network for optimization problems with hybrid constraints

A Collaborative Neurodynamic Optimization Approach to Distributed Nash-Equilibrium Seeking in Multicluster Games With Nonconvex Functions

Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks

A nonautonomous-differential-inclusion neurodynamic approach for nonsmooth distributed optimization on multi-agent systems

Solving Pseudomonotone Variational Inequalities and Pseudoconvex Optimization Problems Using the Projection Neural Network

Two-Timescale Joint Optimization of Task Scheduling and Resource Scaling in Multi-Data Center System Based on Multi-Agent Deep Reinforcement Learning

Two-timescale neurodynamic approaches to supervised feature selection based on alternative problem formulations

Parallel Solution of Nonlinear Projection Equations in a Multitask Learning Framework

Design and Analysis of a Novel Distributed Gradient Neural Network for Solving Consensus Problems in a Predefined Time

Distributed Continuous‐time Constrained Convex Optimization with General Time‐varying Cost Functions

Distributed Optimization Algorithm with Superlinear Convergence Rate

Enhancing neurodynamic approach with physics-informed neural networks for solving non-smooth convex optimization problems

A Fixed-Time Proximal Gradient Neurodynamic Network With Time-Varying Coefficients for Composite Optimization Problems and Sparse Optimization Problems With Log-Sum Function