Deep Equilibrium Models Meet Federated Learning

Alexandros Gkillas, Dimitris Ampeliotis, Kostas Berberidis
2023-05-30
Abstract: In this study, the problem of Federated Learning (FL) is explored from a new perspective by utilizing Deep Equilibrium (DEQ) models instead of conventional deep learning networks. We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL, such as the communication overhead due to sharing large models and the ability to incorporate heterogeneous edge devices with significantly different computation capabilities. Additionally, a weighted average fusion rule is proposed at the server side of the FL framework to account for the different qualities of models from heterogeneous edge devices. To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning, contributing to the development of an efficient and effective FL framework. Finally, promising initial experimental results are presented, demonstrating the potential of this approach in addressing challenges of FL.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper addresses two main challenges in Federated Learning (FL):

1. **Communication Burden**: Frequent communication between edge devices and the central server creates substantial overhead. Conventional deep learning models have a large number of parameters, and transmitting them requires considerable bandwidth and time.
2. **Device Heterogeneity**: Edge devices (such as IoT devices) participating in federated learning differ significantly in computing power and resources. Some devices are powerful, while others are resource-limited. This heterogeneity makes it difficult for all devices to use the same model and may lead to degraded performance or low training efficiency.

To address these problems, the authors propose introducing Deep Equilibrium (DEQ) models into the federated learning framework. A DEQ model represents the entire deep learning architecture as the equilibrium (fixed-point) computation of a single layer or unit, which yields an efficient, effectively infinitely deep neural network. This approach has the following advantages:

- **Improved Communication Efficiency**: A DEQ model significantly reduces the number of parameters that must be transmitted, because only the small set of parameters defining the transformation function \( f_\theta(\cdot) \) is sent, rather than the many parameters of a conventional multi-layer network.
- **Reduced Memory Requirements**: Because the DEQ representation is compact, the memory required on both the client and server sides is greatly reduced, which allows resource-limited devices to participate in federated learning.
- **Support for Heterogeneous Devices**: The number of fixed-point iterations can be adjusted to the computing power of each device. Powerful devices can run more iterations to achieve better accuracy, while weaker devices can still provide useful model updates with fewer iterations.

In addition, the authors propose a weighted-average fusion rule for aggregating DEQ model updates from different devices on the server side, which accounts for the differing quality of those updates.

In summary, this research targets the communication-efficiency and device-heterogeneity problems of existing federated learning frameworks by combining DEQ models with federated learning, with the aim of building a more efficient and flexible federated learning system.
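
The two mechanisms summarized above, a fixed-point iteration whose length adapts to each device and a weighted average on the server, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The scalar map `f_theta`, the helper names `solve_fixed_point` and `weighted_average`, and the use of iteration counts as fusion weights are all hypothetical; this summary does not say how the paper computes its weights, and local training of the parameters is omitted entirely.

```python
import math


def f_theta(z, x, theta):
    """Hypothetical single-layer map f_theta(z; x) = tanh(w*z + u*x + b).

    A scalar state is used only to keep the sketch readable.
    """
    w, u, b = theta
    return math.tanh(w * z + u * x + b)


def solve_fixed_point(x, theta, max_iters, tol=1e-10):
    """Iterate z <- f_theta(z; x) starting from z = 0.

    Stops after max_iters steps (the device's compute budget) or once the
    update size falls below tol. Returns the final state and the size of
    the last update, which indicates how close z is to the equilibrium.
    """
    z = 0.0
    residual = float("inf")
    for _ in range(max_iters):
        z_next = f_theta(z, x, theta)
        residual = abs(z_next - z)
        z = z_next
        if residual < tol:
            break
    return z, residual


def weighted_average(client_thetas, weights):
    """Fuse client parameter tuples with a normalized weighted average.

    How the weights are chosen is an assumption left to the caller.
    """
    total = sum(weights)
    n_params = len(client_thetas[0])
    return tuple(
        sum(w * theta[i] for w, theta in zip(weights, client_thetas)) / total
        for i in range(n_params)
    )


if __name__ == "__main__":
    # |w| < 1 makes the map a contraction, so the iteration converges.
    theta = (0.5, 1.0, 0.0)

    # Same model, different compute budgets (made-up values).
    z_weak, r_weak = solve_fixed_point(1.0, theta, max_iters=3)
    z_strong, r_strong = solve_fixed_point(1.0, theta, max_iters=50)
    print("weak device:  ", z_weak, r_weak)
    print("strong device:", z_strong, r_strong)

    # Made-up client parameters; iteration budgets serve as an assumed
    # proxy for update quality when forming the fusion weights.
    client_thetas = [(0.4, 1.1, 0.0), (0.6, 0.9, 0.1)]
    print("fused theta:  ", weighted_average(client_thetas, weights=[3, 50]))
```

Both devices hold only the three parameters of `f_theta`, which is all that would need to be communicated; what differs is how many iterations each one can afford. The weaker device stops farther from the equilibrium, and a fusion rule of this kind gives its update correspondingly less influence.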