Abstract: Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices. Unfortunately, current deep networks remain not only too compute-heavy for inference and training on edge devices, but also too large for communicating updates over bandwidth-constrained networks. In this paper, we develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST), by which complex neural networks can be deployed and trained with substantially improved efficiency in both on-device computation and in-network communication. At the core of FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network. With this scheme, "two birds are killed with one stone": instead of full models, each client performs efficient training of its own sparse networks, and only sparse networks are transmitted between devices and the cloud. Furthermore, our results reveal that dynamic sparsity during FL training accommodates local heterogeneity among FL agents more flexibly than fixed, shared sparse masks. Moreover, dynamic sparsity naturally introduces an "in-time self-ensembling effect" into the training dynamics and improves FL performance even over dense training. In a realistic and challenging non-IID FL setting, FedDST consistently outperforms competing algorithms in our experiments: for instance, at any fixed upload data cap on non-IID CIFAR-10, it gains an accuracy advantage of 10% over FedAvgM; the accuracy gap remains 3% even when FedAvgM is given 2x the upload data cap, further demonstrating the efficacy of FedDST. Code is available at: https://github.com/bibikar/feddst
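The two ideas named in the abstract, readjusting a sparse mask during training and exchanging only sparse networks, can be illustrated with a minimal sketch. This is not the FedDST implementation: the function names `prune_and_regrow` and `aggregate_sparse`, the parameter `readjust_frac`, the magnitude-based pruning rule, the random regrowth rule, and the per-coordinate averaging over clients that hold a weight are all assumptions made for illustration. The abstract does not state which criteria the method actually uses.

```python
import random


def prune_and_regrow(weights, mask, readjust_frac, rng):
    """Hypothetical mask readjustment that keeps the number of active weights fixed.

    Drops the k smallest-magnitude active weights, then activates k positions
    that were inactive before pruning. Random regrowth is an assumed stand-in
    for whatever growth criterion the real method uses.
    """
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]
    k = min(int(readjust_frac * len(active)), len(inactive))

    to_prune = sorted(active, key=lambda i: abs(weights[i]))[:k]
    to_grow = rng.sample(inactive, k)

    new_weights, new_mask = list(weights), list(mask)
    for i in to_prune:
        new_mask[i] = False
        new_weights[i] = 0.0
    for i in to_grow:
        new_mask[i] = True
        new_weights[i] = 0.0  # regrown weights start at zero
    return new_weights, new_mask


def aggregate_sparse(client_weights, client_masks):
    """Assumed server-side rule: average each coordinate over the clients
    whose mask keeps it; coordinates kept by no client stay at zero."""
    n = len(client_weights[0])
    merged = []
    for j in range(n):
        kept = [w[j] for w, m in zip(client_weights, client_masks) if m[j]]
        merged.append(sum(kept) / len(kept) if kept else 0.0)
    return merged


if __name__ == "__main__":
    rng = random.Random(0)
    w = [0.5, -0.1, 0.0, 0.9, 0.0, -0.05]
    m = [True, True, False, True, False, True]
    w2, m2 = prune_and_regrow(w, m, readjust_frac=0.5, rng=rng)
    print(m2, sum(m2))  # the count of active weights is unchanged
    print(aggregate_sparse([[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]],
                           [[True, False, True], [True, True, False]]))
```

In this sketch a client would upload only the entries where its mask is `True`, which is where the communication saving would come from; because each client may hold a different mask, the server-side average has to be taken per coordinate rather than over whole models.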

Can Federated Learning Clients Be Lightweight? A Plug-and-Play Symmetric Conversion Module

FedDGP: Disentangling Global and Personal Models for Federated Learning

Federated Learning with Additional Mechanisms on Clients to Reduce Communication Costs

Federated Learning via Indirect Server-Client Communications

Client Selection for Federated Learning With Non-IID Data in Mobile Edge Computing

FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients

Communication Efficient Federated Learning With Heterogeneous Structured Client Models

FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering

Client Contribution Normalization for Enhanced Federated Learning

Towards Client Driven Federated Learning

Leveraging Foundation Models to Improve Lightweight Clients in Federated Learning

Flexible Clustered Federated Learning for Client-Level Data Distribution Shift

Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better

Latency Aware Semi-synchronous Client Selection and Model Aggregation for Wireless Federated Learning

FedPD: A Federated Learning Framework With Adaptivity to Non-IID Data

Mitigating System Bias in Resource Constrained Asynchronous Federated Learning Systems

Federated Learning With Client Selection and Gradient Compression in Heterogeneous Edge Systems

Joint Local Relational Augmentation and Global Nash Equilibrium for Federated Learning with Non-IID Data

FedHC: A Scalable Federated Learning Framework for Heterogeneous and Resource-Constrained Clients

Federated Learning Optimization Algorithm for Automatic Weight Optimal

FedGroup: Efficient Clustered Federated Learning via Decomposed Data-Driven Measure