Abstract: Recent advances in wearable devices and the Internet of Things (IoT) have led to massive growth in the sensor data generated on edge devices. Labeling such massive data for classification tasks has proven challenging. In addition, data generated by different users carry distinct personal attributes and edge heterogeneity, making it impractical to develop a global model that adapts well to all users. Concerns over data privacy and communication costs also prohibit centralized data accumulation and training. We propose SemiPFL, which supports edge users who have no labels or only limited labeled datasets, together with a sizable amount of unlabeled data that is nonetheless insufficient to train a well-performing model. In this work, edge users collaborate to train a hyper-network on the server that generates personalized autoencoders for each user. After receiving updates from edge users, the server produces a set of base models for each user, which the users aggregate locally using their own labeled datasets. We comprehensively evaluate our proposed framework on various public datasets from a wide range of application scenarios, from wearable health to IoT, and demonstrate that SemiPFL outperforms state-of-the-art federated learning frameworks under the same assumptions in terms of user performance, network footprint, and computational consumption. We also show that the solution performs well for users with no labels or limited labeled datasets, and that performance improves as the amount of labeled data and the number of users increase, signifying the effectiveness of SemiPFL in handling data heterogeneity and limited annotation. We also demonstrate the stability of SemiPFL in handling user hardware resource heterogeneity in three real-time scenarios.

What problem does this paper attempt to address?

The problems that this paper attempts to solve are as follows:

1. **Labeling of large-scale sensor data**: With the development of wearable devices and the Internet of Things (IoT), edge devices generate a large amount of sensor data. Labeling these data for classification tasks is challenging: the labeling process is time-consuming and costly, and centralized data accumulation and training are infeasible because of privacy concerns.
2. **Heterogeneity of user data and personalized needs**: Data generated by different users have different personal attributes and edge heterogeneity, making it impractical to develop a global model that suits all users. In addition, the hardware resources of user devices differ, which further increases the difficulty of training personalized models.
3. **Model training with limited labeled data**: Many users may not have enough labeled data to train high-performance models, especially on edge devices. A method that can train models effectively with little or no labeled data is therefore required.
4. **Collaborative training while protecting user privacy**: Users should be able to participate in collaborative training without violating their privacy, while achieving model performance better than that of traditional supervised learning methods.

To this end, the authors propose SemiPFL (Semi-Supervised Personalized Federated Learning), a personalized semi-supervised federated learning framework that aims to solve the above problems. Specifically, SemiPFL addresses them in the following ways:

- **Generate personalized autoencoders using a hyper-network**: The server uses a hyper-network to generate personalized autoencoders for each user, thus adapting to the personalized needs of different users.
- **Combine labeled and unlabeled data for training**: Users can use their small amount of local labeled data and large amount of unlabeled data to update the model, thereby improving model performance.
- **Handle hardware resource heterogeneity**: The influence of different hardware resources on model performance is studied to ensure that the framework runs stably under different hardware configurations.
- **Provide better personalized models**: Compared with a traditional global model, SemiPFL can provide each user with a more personalized model that adapts to their own data distribution, thereby improving overall performance.

In summary, SemiPFL aims to handle classification tasks on large-scale multi-sensor time-series data effectively through a personalized semi-supervised federated learning framework, especially when labeled data are limited, while protecting user privacy.
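
The idea of a server-side hyper-network that emits a personalized autoencoder per user can be illustrated with a small sketch. This is not the authors' architecture or code: the one-layer linear hyper-network, the class name `LinearHyperNetwork`, the user embeddings, and all dimensions are assumptions chosen for illustration, and training (the federated updates from edge users) is omitted entirely. The sketch only shows the generation step, in which a per-user embedding is mapped to the weights of a tiny linear autoencoder.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LinearHyperNetwork:
    """Hypothetical one-layer hyper-network: user embedding -> autoencoder weights."""

    def __init__(self, embed_dim, input_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.input_dim = input_dim
        self.latent_dim = latent_dim
        # One output per autoencoder parameter: encoder plus decoder, flattened.
        n_params = 2 * input_dim * latent_dim
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
                        for _ in range(n_params)]

    def generate(self, user_embedding):
        """Return (encoder, decoder) weight matrices for one user."""
        flat = matvec(self.weights, user_embedding)
        split = self.input_dim * self.latent_dim
        enc_flat, dec_flat = flat[:split], flat[split:]
        # Encoder: latent_dim rows of length input_dim.
        encoder = [enc_flat[i * self.input_dim:(i + 1) * self.input_dim]
                   for i in range(self.latent_dim)]
        # Decoder: input_dim rows of length latent_dim.
        decoder = [dec_flat[i * self.latent_dim:(i + 1) * self.latent_dim]
                   for i in range(self.input_dim)]
        return encoder, decoder


def autoencode(encoder, decoder, sample):
    """Encode a sample to the latent space, then reconstruct it (linear, no bias)."""
    latent = matvec(encoder, sample)
    return latent, matvec(decoder, latent)


# Assumed setup: 3-dim user embeddings, 4 sensor features, 2-dim latent space.
hyper = LinearHyperNetwork(embed_dim=3, input_dim=4, latent_dim=2)
user_embeddings = {"user_a": [1.0, 0.0, 0.0], "user_b": [0.0, 1.0, 0.0]}
sample = [0.5, -1.0, 0.25, 2.0]

results = {}
for user, embedding in user_embeddings.items():
    enc, dec = hyper.generate(embedding)
    results[user] = autoencode(enc, dec, sample)
```

Because each user's autoencoder is a function of that user's embedding, the same sensor sample is encoded differently for `user_a` and `user_b`, which is the sense in which the generated models are personalized. In a full system the hyper-network weights and embeddings would be learned from user updates rather than drawn at random.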

SemiPFL: Personalized Semi-Supervised Federated Learning Framework for Edge Intelligence

Deep Learning Based Coded Over-the-Air Computation for Personalized Federated Learning

Towards Fast Personalized Semi-Supervised Federated Learning in Edge Networks: Algorithm Design and Theoretical Guarantee

Optimizing Efficient Personalized Federated Learning with Hypernetworks at Edge

Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT

PPSFL: Privacy-Preserving Split Federated Learning for heterogeneous data in edge-based Internet of Things

Personalized Edge Intelligence via Federated Self-Knowledge Distillation

Semi-Federated Learning for Internet of Intelligence

Edge-assisted U-Shaped Split Federated Learning with Privacy-preserving for Internet of Things

Personalized Federated Learning for Intelligent IoT Applications: A Cloud-Edge based Framework

Semi-Federated Learning for Collaborative Intelligence in Massive IoT Networks

Personalized federated learning for heterogeneous data: A distributed edge clustering approach

Semi-Federated Learning for Connected Intelligence With Computing-Heterogeneous Devices

Towards Fairer and More Efficient Federated Learning via Multidimensional Personalized Edge Models

Federated semi-supervised learning with tolerant guidance and powerful classifier in edge scenarios

Personalized Federated Learning Incorporating Adaptive Model Pruning at the Edge

PFLF: Privacy-Preserving Federated Learning Framework for Edge Computing

Adaptive Personalized Federated Learning With One-Shot Screening

Energy-Aware Edge Association for Cluster-based Personalized Federated Learning