When Foresight Pruning Meets Zeroth-Order Optimization: Efficient Federated Learning for Low-Memory Devices

Pengyu Zhang, Yingjie Liu, Yingbo Zhou, Xiao Du, Xian Wei, Ting Wang, Mingsong Chen
2024-05-08
Abstract: Although Federated Learning (FL) enables collaborative learning in Artificial Intelligence of Things (AIoT) design, it fails to work on low-memory AIoT devices due to its heavy memory usage. To address this problem, various federated pruning methods are proposed to reduce memory usage during inference. However, few of them can substantially mitigate the memory burdens during pruning and training. As an alternative, zeroth-order or backpropagation-free (BP-Free) methods can partially alleviate the memory consumption, but they suffer from scaling up and large computation overheads, since the gradient estimation error and floating point operations (FLOPs) increase as the dimensionality of the model parameters grows. In this paper, we propose a federated foresight pruning method based on Neural Tangent Kernel (NTK), which can seamlessly integrate with federated BP-Free training frameworks. We present an approximation to the computation of federated NTK by using the local NTK matrices. Moreover, we demonstrate that the data-free property of our method can substantially reduce the approximation error in extreme data heterogeneity scenarios. Since our approach improves the performance of the vanilla BP-Free method with fewer FLOPs and truly alleviates memory pressure during training and inference, it makes FL more friendly to low-memory devices. Comprehensive experimental results obtained from simulation- and real test-bed-based platforms show that our federated foresight-pruning method not only preserves the ability of the dense model with a memory reduction up to 9x but also boosts the performance of the vanilla BP-Free method with dramatically fewer FLOPs.
Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper addresses the problem of running Federated Learning (FL) on low-memory AIoT devices. Existing FL methods consume too much memory to run effectively on such devices. To solve this, the authors propose a foresight pruning method based on the Neural Tangent Kernel (NTK) and combine it with zeroth-order optimization to achieve efficient FL on low-memory devices.

### Main problems

1. **Large memory consumption**: Existing FL methods need a large amount of memory during both training and inference, so they cannot run effectively on memory-constrained AIoT devices.
2. **Limitations of existing pruning methods**: Several federated pruning methods reduce memory usage during inference, but few of them substantially reduce the memory burden during pruning and training.
3. **Challenges of zeroth-order optimization methods**: Zeroth-order, or backpropagation-free (BP-Free), methods reduce memory consumption to some extent, but the gradient estimation error and the number of floating-point operations (FLOPs) grow with the dimensionality of the model parameters, which leads to poor scalability and large computational overhead.

### Solutions

The authors propose an NTK-based foresight pruning method and integrate it with a BP-Free training framework. It has three parts:

1. **Foresight pruning**: By approximating the federated NTK matrix with the local NTK matrices, the authors obtain a data-free foresight pruning method, which substantially reduces the approximation error in extreme data heterogeneity scenarios.
2. **Zeroth-order optimization**: Gradients are estimated with Stein's identity, which avoids the large amount of memory required for backpropagation and so reduces the memory footprint of training.
3. **Sparse structure utilization**: The sparse structure produced by pruning reduces the local computational overhead and improves the performance of the Stein's identity estimator in the FL setting.

### Main contributions

1. A new memory-efficient foresight pruning method that can handle various degrees of data heterogeneity.
2. An approximation method for the federated NTK matrix, together with a demonstration that the data-free property of the method effectively reduces the approximation error.
3. The integration of the proposed foresight pruning method with BP-Free training, evaluated through comprehensive experiments on simulation and real test-bed platforms.

### Experimental results

Experiments on the CIFAR-10 and CIFAR-100 datasets show the advantages of the method in accuracy and computational efficiency. Specifically:

- Under an extremely non-IID data distribution, the accuracy drop of the NTK method is only 0.42%, which is much lower than that of the other methods.
- Compared with FedDST, the NTK method needs slightly more FLOPs for a single forward pass, but because it avoids the additional computation caused by adjusting the sparse structure, overall training is more stable and efficient.

In conclusion, the paper combines foresight pruning with zeroth-order optimization to obtain an efficient FL method for low-memory AIoT devices, addressing the memory bottleneck that existing methods hit on these devices.
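
As a rough illustration of the zeroth-order gradient estimation described under Solutions, the following Python sketch estimates a gradient from forward passes only, using the Gaussian form of Stein's identity, and perturbs only the weights that a pruning mask keeps. It is not the authors' implementation: the function name `zo_gradient`, the parameters `num_samples` and `sigma`, the binary `mask`, and the toy quadratic loss are all assumptions made for this example.

```python
import numpy as np


def zo_gradient(loss_fn, theta, mask, num_samples=64, sigma=1e-3, rng=None):
    """Forward-only gradient estimate restricted to unpruned weights.

    Uses the Gaussian form of Stein's identity,
        grad f(theta) ~= E_u[(f(theta + sigma * u) - f(theta)) / sigma * u],
    with u ~ N(0, I). Entries where mask == 0 are never perturbed, so
    their estimated gradient is exactly zero.
    All names here are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = loss_fn(theta)  # one forward pass at the current weights
    grad = np.zeros_like(theta, dtype=float)
    for _ in range(num_samples):
        # Gaussian direction, zeroed on pruned coordinates.
        u = rng.standard_normal(theta.shape) * mask
        # One extra forward pass per sample; no backpropagation needed.
        delta = loss_fn(theta + sigma * u) - base
        grad += (delta / sigma) * u
    return grad / num_samples


if __name__ == "__main__":
    # Toy quadratic loss whose true gradient is theta - target.
    target = np.ones(8)

    def toy_loss(w):
        return 0.5 * float(np.sum((w - target) ** 2))

    theta = np.zeros(8)
    mask = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)  # keep 4 of 8 weights
    estimate = zo_gradient(toy_loss, theta, mask, num_samples=2000,
                           rng=np.random.default_rng(0))
    print("estimated gradient:", estimate)
    print("true gradient on kept weights:", (theta - target)[:4])
```

Because pruned coordinates are never perturbed, the estimator works in a lower-dimensional subspace. In this sketch its variance grows with the number of perturbed weights, which gives one intuition for why a sparse structure can help a BP-Free estimator.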