Abstract: Most existing wireless federated learning (FL) studies focus on homogeneous model settings, where all devices train identical local models. In this setting, devices with poor communication and computation capabilities may delay the global model update and degrade FL performance. Moreover, the scale of the global model is restricted by the device with the lowest capability. To tackle these challenges, this work proposes an adaptive model pruning-based FL (AMP-FL) framework, in which the edge server dynamically generates sub-models by pruning the global model for devices' local training, adapting to their heterogeneous computation capabilities and time-varying channel conditions. Since involving sub-models with diverse structures in the global model update may negatively affect training convergence, we propose compensating for the gradients of pruned model regions with the devices' historical gradients. We then introduce an age of information (AoI) metric to characterize the staleness of local gradients and theoretically analyze the convergence behaviour of AMP-FL. The convergence bound suggests that learning performance improves when devices whose gradients have a large AoI are scheduled and when the model regions with small AoI are the ones pruned for each device. Inspired by this, we define a new objective function, the average AoI of local gradients, to transform the inexplicit global loss minimization problem into a tractable one for the design of device scheduling, model pruning, and resource block (RB) allocation. Through detailed analysis, we derive the optimal model pruning strategy and transform the RB allocation problem into an equivalent linear program that can be solved efficiently. Experimental results demonstrate the effectiveness and superiority of the proposed approaches. Compared with FL schemes that use homogeneous model settings, the proposed AMP-FL achieves speedups of 1.9x and 1.6x on the MNIST and CIFAR-10 datasets, respectively.
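
The gradient compensation and AoI bookkeeping described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `aggregate_with_compensation`, the per-region scalar gradients, the plain averaging rule, and the convention that AoI resets to 0 on a fresh update and otherwise grows by 1 per round are all assumptions made for illustration.

```python
import numpy as np

def aggregate_with_compensation(fresh, masks, scheduled, hist, aoi):
    """One aggregation round (hypothetical sketch, not the paper's exact rule).

    fresh:     (K, R) float, gradients computed this round per device and region
    masks:     (K, R) bool, True where the region is kept in the device's sub-model
    scheduled: (K,)   bool, True for devices scheduled this round
    hist:      (K, R) float, last gradient received per device and region (updated in place)
    aoi:       (K, R) int, rounds since each stored gradient was refreshed (updated in place)
    """
    # A region is refreshed only if its device is scheduled AND the region was not pruned.
    updated = masks & scheduled[:, None]
    hist[updated] = fresh[updated]   # store the fresh gradients
    aoi[updated] = 0                 # assumed convention: a fresh gradient has age 0
    aoi[~updated] += 1               # every other stored gradient becomes one round staler
    # Pruned or unscheduled entries are compensated by their stored historical gradients.
    return hist.mean(axis=0)

# Two devices, three model regions.
K, R = 2, 3
hist = np.zeros((K, R))
aoi = np.zeros((K, R), dtype=int)
fresh = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Round 1: only device 0 is scheduled, and its region 1 is pruned.
g1 = aggregate_with_compensation(
    fresh,
    np.array([[True, False, True], [True, True, True]]),
    np.array([True, False]),
    hist, aoi,
)
avg_aoi = aoi.mean()  # the average-AoI quantity the abstract uses as a surrogate objective
```

In this sketch, `avg_aoi` is the quantity that scheduling and pruning decisions would try to keep small: scheduling a device with stale gradients and pruning only regions that were refreshed recently both reduce it.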

Model Pruning for Distributed Learning over the Air

Pruning Analog Over-the-Air Distributed Learning Models with Accuracy Loss Guarantee

Analog Gradient Aggregation for Federated Learning Over Wireless Networks: Customized Design and Convergence Analysis

Over-the-air Learning Rate Optimization for Federated Learning

Joint Model Pruning and Resource Allocation for Wireless Time-triggered Federated Learning

Over-the-Air Federated Learning with Joint Adaptive Computation and Power Control

Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation

Adaptive Model Pruning for Communication and Computation Efficient Wireless Federated Learning

Learning by Over-the-Air Training: Distributed Precoding for Cell-Free Massive MIMO

Joint Model Pruning and Device Selection for Communication-Efficient Federated Edge Learning

Revisiting Analog Over-the-Air Machine Learning: The Blessing and Curse of Interference

One-Bit Byzantine-Tolerant Distributed Learning via Over-the-Air Computation

Approximate to Be Great: Communication Efficient and Privacy-Preserving Large-Scale Distributed Deep Learning in Internet of Things

Over-the-Air Federated Learning and Optimization

Over-The-Air Federated Learning: Status Quo, Open Challenges, and Future Directions

Adaptive Model Pruning and Personalization for Federated Learning Over Wireless Networks

Federated Learning via Over-the-Air Computation

Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch

Model Aggregation Method for Data Parallelism in Distributed Real-Time Machine Learning of Smart Sensing Equipment

Age-Based Device Selection and Transmit Power Optimization in Over-the-Air Federated Learning

Distillation Sparsity Training Algorithm for Accelerating Convolutional Neural Networks in Embedded Systems