Abstract: Neural network pruning is an essential technique for reducing the size and complexity of deep neural networks, enabling large-scale models on devices with limited resources. However, existing pruning approaches heavily rely on training data for guiding the pruning strategies, making them ineffective for federated learning over distributed and confidential datasets. Additionally, the memory- and computation-intensive pruning process becomes infeasible for resource-constrained devices in federated learning. To address these challenges, we propose FedTiny, a distributed pruning framework for federated learning that generates specialized tiny models for memory- and computing-constrained devices. We introduce two key modules in FedTiny to adaptively search coarse- and finer-pruned specialized models to fit deployment scenarios with sparse and cheap local computation. First, an adaptive batch normalization selection module is designed to mitigate biases in pruning caused by the heterogeneity of local data. Second, a lightweight progressive pruning module prunes the models more finely under strict memory and computational budgets, allowing the pruning policy for each layer to be determined gradually rather than by evaluating the overall model structure. The experimental results demonstrate the effectiveness of FedTiny, which outperforms state-of-the-art approaches, particularly when compressing deep models to extremely sparse tiny models. FedTiny achieves an accuracy improvement of 2.61% while significantly reducing the computational cost by 95.91% and the memory footprint by 94.01% compared to state-of-the-art methods.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to generate specialized tiny neural networks on distributed and confidential datasets in the federated learning environment to adapt to the memory and computing power of resource-constrained devices. Specifically:

1. **Existing pruning methods rely on training data**: Traditional neural network pruning methods rely heavily on training data to guide the pruning strategy, which makes them ineffective when dealing with confidential data distributed across multiple devices.
2. **Challenges of resource-constrained devices**: The pruning process itself is memory- and compute-intensive, which is not feasible for resource-constrained devices.
3. **Non-IID data problem**: The data distribution on different devices may be inconsistent, resulting in biased pruning results on the server side.

To solve these problems, the authors propose FedTiny, a distributed pruning framework for federated learning, aiming to generate specialized tiny neural networks for resource-constrained devices. FedTiny introduces two key modules:

1. **Adaptive Batch Normalization Selection Module**:
   - Through an indirect pruning method, it evaluates the server-side pruning results on the device, thereby identifying a specialized coarse-pruned model.
   - The device only evaluates the server-side pruning and feeds back the batch normalization parameters to the server to reduce the computing and communication costs.
2. **Lightweight Progressive Pruning Module**:
   - It gradually adjusts the model structure and evaluates only a part of the parameters (for example, a single layer) each time, thereby significantly reducing the memory, computing, and communication costs.
   - By iteratively growing and pruning parameters, the model structure gradually approaches the optimal structure.
Through these two modules, FedTiny can effectively perform pruning on resource-constrained devices and generate efficient tiny neural networks while maintaining high accuracy and low computing costs.

### Formula Summary

- Batch normalization transformation formula:

  \[ \hat{x}_i=\frac{x_i - \mu}{\sqrt{\sigma^2+\epsilon}} \]

  where \(\mu\) and \(\sigma\) are the mean and standard deviation respectively, and \(\epsilon\) is a small constant.

- Batch normalization parameter update formula:

  \[ \mu_t = \gamma\mu_{t - 1}+(1 - \gamma)\mu_i,\quad\sigma_t^2=\gamma\sigma_{t - 1}^2+(1 - \gamma)\sigma_i^2 \]

  where \(\gamma\) is the momentum coefficient and \(t\) is the number of training iterations.

- Global batch normalization parameter aggregation formula in the Adaptive Batch Normalization Selection Module:

  \[ \mu^{(c)}=\frac{\sum_{k = 1}^K|D_k|\mu_k^{(c)}}{\sum_{k = 1}^K|D_k|},\quad\sigma^{(c)}=\frac{\sum_{k = 1}^K|D_k|\sigma_k^{(c)}}{\sum_{k = 1}^K|D_k|} \]

  where \(|D_k|\) represents the number of samples of the \(k\)-th device.

- Gradient calculation formula in the Progressive Pruning Module:

  \[ \tilde{g}_{k,l}^t=\text{TopK}(g_{k,l}^t,a_l^t) \]

  where \(\text{TopK}(v,k)\) is a threshold function that replaces elements with absolute values less than the \(k\)-th largest absolute value with 0.

Through these methods, FedTiny achieves efficient and low-resource-consumption neural network pruning in the federated learning environment.
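The formulas in this summary can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not FedTiny's implementation: the function names (`normalize`, `update_running_stats`, `aggregate_bn`, `top_k`) and the default values of `eps` and `gamma` are hypothetical, and scalars and flat lists stand in for per-channel tensors.

```python
import math


def normalize(xs, mu, var, eps=1e-5):
    """Batch normalization transform: (x - mu) / sqrt(var + eps)."""
    return [(x - mu) / math.sqrt(var + eps) for x in xs]


def update_running_stats(mu_prev, var_prev, mu_batch, var_batch, gamma=0.9):
    """Momentum update of the running mean and variance.

    gamma weights the previous estimate, (1 - gamma) the current batch.
    The default of 0.9 is an illustrative assumption.
    """
    mu_t = gamma * mu_prev + (1 - gamma) * mu_batch
    var_t = gamma * var_prev + (1 - gamma) * var_batch
    return mu_t, var_t


def aggregate_bn(stats, sizes):
    """Average one per-device statistic, weighted by sample counts |D_k|."""
    total = sum(sizes)
    return sum(n * s for n, s in zip(sizes, stats)) / total


def top_k(values, k):
    """Zero every entry whose magnitude is below the k-th largest magnitude."""
    if k <= 0:
        return [0.0] * len(values)
    if k >= len(values):
        return list(values)
    # The k-th largest absolute value acts as the threshold.
    threshold = sorted((abs(v) for v in values), reverse=True)[k - 1]
    return [v if abs(v) >= threshold else 0.0 for v in values]


if __name__ == "__main__":
    # Two hypothetical devices holding 100 and 300 samples.
    print(aggregate_bn([1.0, 3.0], [100, 300]))
    # Keep only the two largest-magnitude gradient entries.
    print(top_k([0.5, -2.0, 0.1, 1.5], 2))
```

With two devices holding 100 and 300 samples and local means 1.0 and 3.0, the aggregated mean is 2.5, since the larger device contributes three times the weight. Because `top_k` thresholds on magnitude, as the TopK definition describes, tied entries at the threshold are all kept, so the result can contain more than `k` nonzero values.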

Distributed Pruning Towards Tiny Neural Networks in Federated Learning
