MergeSFL: Split Federated Learning with Feature Merging and Batch Size Regulation

Yunming Liao, Yang Xu, Hongli Xu, Lun Wang, Zhiwei Yao, Chunming Qiao
2024-07-22
Abstract: Recently, federated learning (FL) has emerged as a popular technique for edge AI to mine valuable knowledge in edge computing (EC) systems. To mitigate the computing/communication burden on resource-constrained workers and protect model privacy, split federated learning (SFL) has been proposed, integrating both data and model parallelism. Besides resource limitations, SFL still faces two other critical challenges in EC, i.e., statistical heterogeneity and system heterogeneity. To address these challenges, we propose a novel SFL framework, termed MergeSFL, by incorporating feature merging and batch size regulation in SFL. Concretely, feature merging aims to merge the features from workers into a mixed feature sequence, which is approximately equivalent to the features derived from IID data and is employed to promote model accuracy. Batch size regulation, in turn, aims to assign diverse and suitable batch sizes to heterogeneous workers to improve training efficiency. Moreover, MergeSFL jointly optimizes these two strategies, exploiting their coupled relationship, to further enhance the performance of SFL. Extensive experiments are conducted on a physical platform with 80 NVIDIA Jetson edge devices, and the experimental results show that MergeSFL can improve the final model accuracy by 5.82% to 26.22%, with a speedup of about 1.74x to 4.14x, compared to the baselines.
Machine Learning; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture
What problem does this paper attempt to address?
The paper addresses two challenges that Split Federated Learning (SFL) faces in Edge Computing (EC) systems: statistical heterogeneity and system heterogeneity.

1. **Statistical Heterogeneity**: Edge devices (such as Internet of Things devices) often collect non-independent and identically distributed (non-IID) data because of differences in geographical location and user preferences. These differences in data distribution slow model convergence and can reduce the model's accuracy.
2. **System Heterogeneity**: Edge devices differ significantly in their computing and communication capabilities. For example, the CPU frequencies, bandwidths, and throughputs of different devices may differ by more than ten times. In synchronous training, fast devices must therefore wait for slow ones, which increases waiting time and reduces training efficiency.

To address these challenges, the authors propose a new SFL framework, MergeSFL, which introduces Feature Merging and Batch Size Regulation:

- **Feature Merging**: Merge features from different devices into a mixed feature sequence that approximates the features extracted from IID data, thereby improving the model's accuracy.
- **Batch Size Regulation**: Assign different batch sizes to devices according to their computing and communication capabilities, thereby improving training efficiency.

By jointly optimizing these two strategies, MergeSFL copes better with statistical and system heterogeneity and improves both final accuracy and training efficiency. Experimental results show that, compared with the baseline methods, MergeSFL increases the final model accuracy by 5.82% to 26.22% and speeds up training by about 1.74x to 4.14x.
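The two ideas can be illustrated with a minimal Python sketch. It is not the paper's algorithm: MergeSFL jointly optimizes both strategies, whereas this sketch assumes a simple proportional rule (batch size inversely proportional to a worker's per-sample processing time) and plain concatenation of per-worker features. The function names `regulate_batch_sizes` and `merge_features`, the timing values, and the total batch size are all hypothetical.

```python
from typing import List, Sequence, Tuple


def regulate_batch_sizes(per_sample_times: Sequence[float],
                         total_batch: int,
                         min_batch: int = 1) -> List[int]:
    """Assumed rule: batch size proportional to worker speed (1 / per-sample time),
    so that every worker needs roughly the same time per iteration."""
    speeds = [1.0 / t for t in per_sample_times]
    total_speed = sum(speeds)
    return [max(min_batch, round(total_batch * s / total_speed)) for s in speeds]


def merge_features(worker_features: Sequence[Sequence[Tuple[int, int]]]) -> List[Tuple[int, int]]:
    """Assumed merge: concatenate each worker's feature batch into one mixed
    sequence that the server-side model processes as a single large batch."""
    merged: List[Tuple[int, int]] = []
    for feats in worker_features:
        merged.extend(feats)
    return merged


# Hypothetical per-sample processing times (ms) for a fast, medium, and slow worker.
times_ms = [1.0, 2.0, 4.0]
sizes = regulate_batch_sizes(times_ms, total_batch=70)

# Each worker then spends about the same time on its own batch.
iteration_times = [b * t for b, t in zip(sizes, times_ms)]

# Stand-in "features": (worker_id, sample_index) pairs instead of real tensors.
features = [[(w, i) for i in range(b)] for w, b in enumerate(sizes)]
merged = merge_features(features)
```

With these assumed timings the fast worker receives the largest batch and the slow worker the smallest, so no worker idles while waiting for another, and the merged sequence contains samples from every worker's local distribution.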