Utility Aware Optimal Data Selection for Differentially Private Federated Learning in IoV

Jiancong Zhang, Shining Li, Changhao Wang
DOI: https://doi.org/10.1109/jiot.2024.3427132
2024-01-01
Abstract: Federated learning trains models across distributed data sets, so the choice of which data sets participate has a significant impact on model performance. Personalized differential privacy, however, introduces heterogeneity into vehicular data sets: stronger privacy protection may reduce a local model's contribution to model convergence. The goal of this article is therefore to dynamically optimize the combination of data sets to tackle this heterogeneity in differentially private federated learning in the Internet of Vehicles (IoV). This is extremely challenging without direct data access or a visible training process. To this end, we propose an efficient hierarchical data selection method. First, the utility of each client is evaluated using a convergence bound derived from the noise function and the cost function. Accordingly, a collection of high-value clients is selected to maximize the potential contribution of the combination to the global model. Then, we design an optimization function based on the unknown variables within the convergence bound and develop a low-complexity algorithm to approximate the sampling probability. Meanwhile, the aggregation weight of each model is adjusted to ensure unbiased estimation. Experimental results on two real-world trajectory data sets show that the scheme reduces the meter error by 8.90% and 15.97%, respectively, and improves the convergence speed by 23.9% and 27.1%, respectively.
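The abstract's last methodological step, adjusting aggregation weights so that sampling clients non-uniformly still gives an unbiased estimate, can be illustrated with a standard inverse-probability weighting sketch. This is not the paper's algorithm: the sampling probabilities below are arbitrary placeholders rather than the output of the paper's optimization, sampling with replacement is an assumption, and all function names (`sample_clients`, `aggregate`, `full_aggregate`) are hypothetical.

```python
import random


def sample_clients(probs, k, rng):
    """Draw k client indices with replacement, client i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=k)


def aggregate(updates, data_weights, probs, sampled):
    """Inverse-probability weighted aggregate of the sampled clients' updates.

    Each sampled update i is scaled by data_weights[i] / (k * probs[i]), so the
    expectation over the sampling equals the full-participation aggregate.
    """
    k = len(sampled)
    dim = len(updates[0])
    agg = [0.0] * dim
    for i in sampled:
        scale = data_weights[i] / (k * probs[i])
        for d in range(dim):
            agg[d] += scale * updates[i][d]
    return agg


def full_aggregate(updates, data_weights):
    """Aggregate that would result if every client participated."""
    dim = len(updates[0])
    return [sum(w * u[d] for w, u in zip(data_weights, updates)) for d in range(dim)]


if __name__ == "__main__":
    # Toy values, assumed for illustration only.
    updates = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]  # local model updates
    data_weights = [0.5, 0.3, 0.2]                   # e.g. data-size shares, sum to 1
    probs = [0.2, 0.5, 0.3]                          # placeholder sampling probabilities

    rng = random.Random(0)
    sampled = sample_clients(probs, k=2, rng=rng)
    print("sampled clients:", sampled)
    print("weighted estimate:", aggregate(updates, data_weights, probs, sampled))
    print("full aggregate:", full_aggregate(updates, data_weights))
```

A single round's estimate generally differs from the full aggregate; the correction only guarantees that the two agree in expectation, which is why a client sampled with low probability receives a proportionally larger weight when it is selected.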