Exploring the Impact of Non-IID on Federated Learning

Minyang Li, Xiangyun Tang, Siru Chen, Yu Weng, Luyao Peng, Wen Yang
DOI: https://doi.org/10.1109/icbctis59921.2023.00032
2023-01-01
Abstract: In the past few years, the growing problem of data silos has given rise to an emerging framework for distributed deep learning: federated learning. The framework enables multiple participants to build a global model without directly sharing raw data. However, the environments in which federated learning is deployed are usually complex and heterogeneous, so data distributions differ across local devices and the data usually do not satisfy the independent and identically distributed (IID) assumption. In this paper, several different types of non-IID datasets are obtained by partitioning the MNIST dataset using various methods. Each dataset is then trained with the FedAvg algorithm to explore how the different partitions affect the training of federated learning models. The paper concludes that, when a non-IID dataset contains some IID data, a better model can be obtained by appropriately adjusting the training parameters. The paper also describes the parameter-tuning process and analyzes the training results under different parameters, with the aim of improving the efficiency of federated learning training.
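The abstract does not say which partitioning methods were used, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes one common label-skew scheme: a hypothetical `iid_fraction` of the samples is dealt out uniformly at random, and the remainder is sorted by label and cut into contiguous blocks. The function names `partition_indices` and `fedavg` are illustrative; `fedavg` shows only the sample-size-weighted averaging step of FedAvg on flat parameter lists.

```python
import random


def partition_indices(labels, num_clients, iid_fraction, seed=0):
    """Split sample indices among clients (assumed scheme, not the paper's).

    A share `iid_fraction` of the samples is dealt out uniformly at random
    (the IID part). The rest is sorted by label and cut into contiguous
    blocks, so each client receives only a few classes (the skewed part).
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    n_iid = int(len(indices) * iid_fraction)
    iid_part, skew_part = indices[:n_iid], indices[n_iid:]
    skew_part.sort(key=lambda i: labels[i])  # group the skewed part by class

    clients = [[] for _ in range(num_clients)]
    # IID part: round-robin over an already shuffled list.
    for pos, idx in enumerate(iid_part):
        clients[pos % num_clients].append(idx)
    # Skewed part: one contiguous, label-sorted block per client.
    block = -(-len(skew_part) // num_clients)  # ceiling division
    for c in range(num_clients):
        clients[c].extend(skew_part[c * block:(c + 1) * block])
    return clients


def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]
```

With `iid_fraction=0.0` every client holds a label-sorted block only, while values closer to `1.0` approach an IID split. Sweeping this value is one way to produce the "non-IID with partially IID data" setting that the abstract describes.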