Enhancing Semi-Supervised Federated Learning with Progressive Training in Heterogeneous Edge Computing

Jianchun Liu, Jun Liu, Hongli Xu, Yunming Liao, Zhiwei Yao, Min Chen, Chen Qian
DOI: https://doi.org/10.1109/tmc.2024.3492140
IF: 6.075
2024-01-01
IEEE Transactions on Mobile Computing
Abstract: Federated learning (FL) is an efficient distributed learning method that facilitates collaborative model training among multiple edge devices (or clients). However, current research typically assumes that clients have access to ground-truth labels for training, which is unrealistic in practice because clients often lack the expertise to annotate data. Semi-supervised federated learning (SSFL) has been proposed in many existing works to address this problem, but it typically adopts a fixed model architecture for training, which causes two main problems as the amount of pseudo-labeled data varies. First, a shallow model lacks the capacity to fit the growing volume of pseudo-labeled data, leading to poor training performance. Second, a large model overfits when only a few labeled samples are available in SSFL, and also incurs tremendous resource costs (e.g., computation and communication). To tackle these problems, we propose a novel framework, called STAR, which adopts progressive training to enhance model training in SSFL. Specifically, STAR starts from a shallow model and gradually increases the model depth by adding sub-modules (e.g., one or several layers), while simultaneously performing pseudo-labeling for unlabeled data with a specialized confidence threshold. We then propose an efficient algorithm to determine the appropriate model depth for each client under its resource budget and the proper confidence threshold for pseudo-labeling in SSFL. STAR innovatively applies progressive training to SSFL, which significantly contributes to the advancement of the FL field. Extensive experiments demonstrate the effectiveness of STAR. For instance, on CIFAR10, STAR reduces bandwidth consumption by about 40% and improves average accuracy by around 9.8% compared with the baselines. In addition, STAR achieves about 2.2× speedup over the baselines on ImageNet100.
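The two mechanisms named in the abstract, confidence-thresholded pseudo-labeling and progressive depth growth, can be illustrated with a minimal sketch. This is a generic illustration, not STAR's algorithm: the abstract does not specify how the threshold or per-client depth is chosen, so the function names (`pseudo_label`, `grow_depth`), the fixed `threshold` argument, and the simple "add `step` sub-modules up to a budget cap" rule are all assumptions made here for clarity.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label(batch_logits, threshold):
    """Return (sample_index, predicted_class) pairs whose top class
    probability meets the confidence threshold. Samples below the
    threshold are left unlabeled. The threshold is a plain argument
    here (an assumption); the paper determines it algorithmically."""
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((i, best))
    return selected

def grow_depth(current_depth, max_depth, step=1):
    """Hypothetical progressive-training step: add `step` sub-modules,
    capped at the depth the client's resource budget allows."""
    return min(current_depth + step, max_depth)

# Example: one confident prediction and one near-uniform prediction.
logits = [[4.0, 0.0, 0.0], [0.1, 0.0, 0.2]]
labeled = pseudo_label(logits, threshold=0.9)  # only sample 0 qualifies
depth = grow_depth(current_depth=1, max_depth=4)
```

In this sketch only the first sample receives a pseudo-label, because its top class probability exceeds the threshold while the second sample's prediction is close to uniform. A higher threshold yields fewer but cleaner pseudo-labels, which is the trade-off a threshold-selection procedure has to balance.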