Abstract: Algorithm unrolling has emerged as a learning-based optimization paradigm that unfolds truncated iterative algorithms in trainable neural-network optimizers. We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning in order to expedite its convergence. Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers to find a descent direction and the decentralized nature of federated learning. We circumvent the former challenge by feeding stochastic mini-batches to each unrolled layer and imposing descent constraints to guarantee its convergence. We address the latter challenge by unfolding the distributed gradient descent (DGD) algorithm in a graph neural network (GNN)-based unrolled architecture, which preserves the decentralized nature of training in federated learning. We theoretically prove that our proposed unrolled optimizer converges to a near-optimal region infinitely often. Through extensive numerical experiments, we also demonstrate the effectiveness of the proposed framework in collaborative training of image classifiers.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses two challenges in federated learning (FL): speeding up convergence, and doing so while keeping training decentralized.

#### Specific Problems:

1. **Accelerating Convergence**: Traditional federated learning methods usually need a large number of iterations to reach good performance, which is a practical problem for resource-constrained and energy-constrained devices. The paper uses algorithm unrolling to reduce the number of iterations required.
2. **Preserving Decentralized Training**: Federated learning typically relies on a central server to coordinate the nodes, which can become a communication bottleneck. Decentralized federated learning frameworks alleviate this issue but tend to converge more slowly. The paper unrolls distributed gradient descent (DGD) into a graph neural network (GNN)-based architecture, so that the unrolled optimizer still operates in a decentralized way.

#### Method Overview:

- **Stochastic UnRolled Federated learning (SURF)**: A training method that accelerates federated learning by unrolling a truncated iterative algorithm into trainable neural-network layers. Instead of feeding whole datasets to the unrolled optimizer, it feeds a stochastic mini-batch to each unrolled layer and imposes descent constraints to guarantee convergence.
- **Unrolled Distributed Gradient Descent (U-DGD)**: Uses a graph neural network to unroll the DGD algorithm, so that it handles decentralized federated learning problems and extends to classical federated learning scenarios.

#### Main Contributions:

- Proposes SURF, an unrolling-based federated learning method that can be trained with stochastic mini-batches rather than whole datasets while still guaranteeing convergence.
- Unrolls the DGD algorithm into a GNN-based architecture (U-DGD), making it suitable for both decentralized and classical federated learning problems.
- Theoretically proves that the unrolled optimizer trained by SURF converges to a near-optimal region infinitely often.
- Experimentally validates the proposed method on collaborative training of image classifiers.

Together, these contributions address the two challenges above and yield a federated learning method with faster convergence.
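To make the idea of unrolling DGD concrete, the following is a minimal pure-Python sketch of the iteration being unrolled. It is not the paper's U-DGD architecture. Each "layer" is one DGD step: agents average with their ring neighbors, then take a local mini-batch gradient step. Everything here is an assumption made for illustration: the scalar quadratic loss, the ring topology, the fixed per-layer step sizes standing in for learned parameters, and the descent check `||grad_l|| <= (1 - eps) * ||grad_{l-1}||`, whose exact form the summary above does not state. All function names are hypothetical.

```python
import random


def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents.

    Each agent averages itself and its two neighbors with weight 1/3.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def minibatch_grad(w, data, batch_size, rng):
    """Stochastic gradient of the assumed local loss 0.5 * (w - x)^2,
    averaged over a random mini-batch drawn without replacement."""
    batch = rng.sample(data, batch_size)
    return w - sum(batch) / batch_size


def dgd_layer(w, W, local_data, step, batch_size, rng):
    """One DGD iteration, viewed as one unrolled layer:
    mix estimates with neighbors, then subtract a local gradient step."""
    n = len(w)
    mixed = [sum(W[i][j] * w[j] for j in range(n)) for i in range(n)]
    return [
        mixed[i] - step * minibatch_grad(w[i], local_data[i], batch_size, rng)
        for i in range(n)
    ]


def global_grad_norm(w, local_data):
    """|gradient| of the average local loss, evaluated at the network average."""
    w_bar = sum(w) / len(w)
    mean_of_means = sum(sum(d) / len(d) for d in local_data) / len(local_data)
    return abs(w_bar - mean_of_means)


def run_unrolled_dgd(local_data, W, steps, batch_size, seed=0):
    """Run len(steps) layers from w = 0 and return the gradient norm
    before the first layer and after every layer."""
    rng = random.Random(seed)
    w = [0.0] * len(local_data)
    norms = [global_grad_norm(w, local_data)]
    for step in steps:  # one (hypothetical, hand-set) step size per layer
        w = dgd_layer(w, W, local_data, step, batch_size, rng)
        norms.append(global_grad_norm(w, local_data))
    return norms


def satisfies_descent(norms, eps, tol=1e-12):
    """Check the assumed layer-wise descent condition on a trajectory."""
    return all(norms[l] <= (1.0 - eps) * norms[l - 1] + tol
               for l in range(1, len(norms)))


if __name__ == "__main__":
    data = [[float(i) + 0.1 * k for k in range(8)] for i in range(5)]
    W = ring_mixing_matrix(5)
    norms = run_unrolled_dgd(data, W, steps=[0.5] * 10, batch_size=2)
    print(norms)
    print(satisfies_descent(norms, eps=0.1))
```

In this sketch the descent condition is only checked after the fact. According to the abstract, SURF instead imposes descent constraints on the unrolled layers so that convergence is guaranteed even though each layer sees only a mini-batch. With small mini-batches, the check above may fail on some layers, which is the kind of behavior such constraints are meant to rule out.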

Stochastic Unrolled Federated Learning

Robust Stochastically-Descending Unrolled Networks

FedEmb: A Vertical and Hybrid Federated Learning Algorithm using Network And Feature Embedding Aggregation

Federated Optimization with Doubly Regularized Drift Correction

FedBnR: Mitigating federated learning Non-IID problem by breaking the skewed task and reconstructing representation

Decentralized Statistical Inference with Unrolled Graph Neural Networks

Stochastic Approximation Approach to Federated Machine Learning

Locally Adaptive Federated Learning

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

Backpropagation of Unrolled Solvers with Folded Optimization

Federated Frank-Wolfe Algorithm

Riemannian Federated Learning via Averaging Gradient Stream

A Derivative-Incorporated Adaptive Gradient Method for Federated Learning

Federated unsupervised representation learning

Unlocking FedNL: Self-Contained Compute-Optimized Implementation

Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization

Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization

Accelerated Federated Learning with Decoupled Adaptive Optimization

Preconditioned Federated Learning

Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis

Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks