DPBalance: Efficient and Fair Privacy Budget Scheduling for Federated Learning as a Service

Yu Liu, Zibo Wang, Yifei Zhu, Chen Chen
2024-02-15
Abstract: Federated learning (FL) has emerged as a prevalent distributed machine learning scheme that enables collaborative model training without aggregating raw data. Cloud service providers further embrace Federated Learning as a Service (FLaaS), allowing data analysts to execute their FL training pipelines over differentially-protected data. Due to the intrinsic properties of differential privacy, the enforced privacy level on data blocks can be viewed as a privacy budget that requires careful scheduling to cater to diverse training pipelines. Existing privacy budget scheduling studies prioritize either efficiency or fairness individually. In this paper, we propose DPBalance, a novel privacy budget scheduling mechanism that jointly optimizes both efficiency and fairness. We first develop a comprehensive utility function incorporating data analyst-level dominant shares and FL-specific performance metrics. A sequential allocation mechanism is then designed using the Lagrange multiplier method and effective greedy heuristics. We theoretically prove that DPBalance satisfies Pareto Efficiency, Sharing Incentive, Envy-Freeness, and Weak Strategy Proofness. We also theoretically prove the existence of a fairness-efficiency tradeoff in privacy budgeting. Extensive experiments demonstrate that DPBalance outperforms state-of-the-art solutions, achieving an average efficiency improvement of $1.44\times \sim 3.49 \times$, and an average fairness improvement of $1.37\times \sim 24.32 \times$.
Distributed, Parallel, and Cluster Computing; Cryptography and Security; Machine Learning
What problem does this paper attempt to address?
This paper proposes DPBalance, a privacy budget scheduling mechanism that addresses efficiency and fairness in Federated Learning as a Service (FLaaS). In FLaaS, multiple data analysts train models on data from different data owners, with Differential Privacy (DP) protecting that data. Because each data block can tolerate only a limited amount of privacy loss, the enforced privacy level acts as a finite budget that must be divided among competing training pipelines. Existing privacy budget scheduling methods focus on either efficiency or fairness, but not both. DPBalance optimizes the two jointly through a utility function that combines analyst-level dominant shares with FL-specific performance metrics, and it allocates budget sequentially using the Lagrange multiplier method and a greedy heuristic. The paper proves that DPBalance satisfies Pareto efficiency, sharing incentive, envy-freeness, and weak strategy-proofness, and theoretically demonstrates a trade-off between fairness and efficiency under practical conditions. In experiments, DPBalance outperforms existing solutions on both criteria, with average efficiency improvements of 1.44× to 3.49× and average fairness improvements of 1.37× to 24.32×. The paper covers related work, background and motivation, system modeling and problem formulation, algorithm design and theoretical analysis, experimental evaluation, and conclusions.
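To make the notion of an analyst-level dominant share more concrete, the following minimal sketch computes it in the style of dominant resource fairness: an analyst's dominant share is taken to be the largest fraction of any single data block's privacy budget that the analyst holds. This definition, the function name `dominant_share`, the block and analyst names, and the numbers are all illustrative assumptions, not the paper's exact formulation.

```python
from typing import Dict


def dominant_share(allocated: Dict[str, float], capacity: Dict[str, float]) -> float:
    """Return the largest per-block fraction of privacy budget held by one analyst.

    Assumption: a DRF-style definition, not necessarily the paper's own.
    `allocated` maps block id -> budget (epsilon) granted to the analyst.
    `capacity` maps block id -> total budget (epsilon) of that block.
    """
    return max(
        (allocated.get(block, 0.0) / cap for block, cap in capacity.items() if cap > 0),
        default=0.0,
    )


# Hypothetical data blocks and their total privacy budgets.
capacity = {"block_a": 1.0, "block_b": 2.0}

# Hypothetical budget already granted to each analyst, per block.
allocations = {
    "analyst_1": {"block_a": 0.5, "block_b": 0.5},
    "analyst_2": {"block_a": 0.25, "block_b": 1.5},
}

shares = {name: dominant_share(alloc, capacity) for name, alloc in allocations.items()}
print(shares)
```

Under these assumed numbers, `analyst_2` has the larger dominant share because it holds three quarters of `block_b`; a fairness-oriented scheduler would tend to favor `analyst_1` in the next allocation round, while an efficiency-oriented one would look at the utility each grant produces instead.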