Federated Learning Framework in Fogbus2-based Edge Computing Environments

Wuji Zhu
DOI: https://doi.org/10.48550/arXiv.2211.07238
2022-11-14
Abstract: Federated learning refers to conducting training on multiple distributed devices and collecting model weights from them to derive a shared machine-learning model. This allows the model to benefit from a rich source of data available from multiple sites. Also, since only model weights are collected from the distributed devices, the privacy of those data is protected. It is useful in situations where collaborative training of machine learning models is necessary while the training data are highly sensitive. This study investigates the implementation of lightweight federated learning to be deployed on a diverse range of distributed resources, including resource-constrained edge devices and resourceful cloud servers. FogBus2, a containerized distributed resource management framework, is selected as the base framework for the implementation. This research provides an architecture and lightweight implementation of federated learning in FogBus2. Moreover, a worker selection technique is proposed and implemented. The worker selection algorithm selects an appropriate set of workers to participate in the training to achieve higher training time efficiency. In addition, this research integrates synchronous and asynchronous models of federated learning alongside a heuristic-based worker selection algorithm. Asynchronous federated learning is shown to be more time-efficient than synchronous federated learning or sequential machine learning training. The performance evaluation demonstrates the efficiency of the federated learning mechanism implemented and integrated with the FogBus2 framework. The worker selection strategy requires 33.9% less time to reach 80% accuracy than sequential training, while asynchronous federated learning further improves on the training time of synchronous federated learning by 63.3%.
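The weight-collection step described in the abstract can be illustrated with a minimal sketch. It assumes a sample-count-weighted average in the style of federated averaging; the abstract does not state which aggregation rule the FogBus2 implementation uses, and the name `federated_average` and the dictionary-of-lists weight layout are hypothetical choices made only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical weight layout: layer name -> flat list of parameters.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Merge worker weights into one shared model.

    Each update is (weights, num_samples). Workers that trained on more
    samples contribute proportionally more (an assumed rule, not
    necessarily the one used in the paper).
    """
    if not updates:
        raise ValueError("at least one worker update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros shaped like the first worker's weights.
    merged: Weights = {
        name: [0.0] * len(values) for name, values in updates[0][0].items()
    }
    for weights, n in updates:
        share = n / total  # this worker's fraction of all samples
        for name, values in weights.items():
            for i, v in enumerate(values):
                merged[name][i] += share * v
    return merged


if __name__ == "__main__":
    worker_a = ({"w": [1.0, 2.0]}, 1)  # trained on 1 sample
    worker_b = ({"w": [3.0, 4.0]}, 3)  # trained on 3 samples
    print(federated_average([worker_a, worker_b]))
```

Only the weights and sample counts leave each worker in this sketch; the raw training data never reach the aggregator, which is the privacy property the abstract refers to.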
Distributed, Parallel, and Cluster Computing