Cluster-Based Secure Aggregation for Federated Learning

Jien Kim, Gunryeong Park, Miseung Kim, Soyoung Park
DOI: https://doi.org/10.3390/electronics12040870
IF: 2.9
2023-02-09
Electronics
Abstract: Secure aggregation has become an essential technique for federated learning: to protect each node's local learning parameters from model inversion attacks, it ensures that the federated learning server learns only the combined result of all local parameters. In this paper, we introduce a novel cluster-based secure aggregation model that handles dropout nodes effectively while reducing communication and computation overheads. Specifically, we consider a federated learning environment with heterogeneous devices deployed across the country, in which the computing power of each node and the amount of its training data can differ. As a result, each node has a different local processing time and a different response time to the server. To identify dropout nodes clearly in this environment, our model clusters nodes with similar response times, based on each node's local processing time and location, and then performs the aggregation on a per-cluster basis. In addition, we propose a new practical additive sharing-based masking protocol to hide the actual local updates of nodes during aggregation. The new masking protocol makes it easy to remove the shares of dropout nodes from the aggregation without using a (t, n) threshold scheme, and updates from dropout nodes remain secure even if they are delivered to the server after the dropout shares have been revealed. Our model also provides mask verification for reliable aggregation: before the aggregation, nodes can publicly verify the correctness and integrity of the masks received from others, relying on the discrete logarithm problem. As a result, the proposed aggregation model is robust to dropout nodes and ensures the privacy of local updates if at least three honest nodes are alive in each cluster. Since the masking process is performed on a cluster basis, our model effectively reduces the overhead of generating and sharing the masking values.
For an average cluster size C and a total number of nodes N, the computation and communication cost of each node is O(C), the computation cost of the server is O(N), and the communication cost is O(NC). We analyzed the security and efficiency of our protocol by simulating diverse dropout scenarios. The simulation results showed that, with four clusters for one hundred nodes, our cluster-based secure aggregation achieves a learning accuracy of about 91% regardless of the dropout rate.
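The idea of additive masking within a single cluster can be illustrated with a minimal sketch. This is not the protocol from the paper: the modulus, the function names (`make_shares`, `mask_update`, `aggregate`), and the way shares tied to a dropped node are revealed are all assumptions chosen for illustration, and mask verification is omitted entirely. The sketch only shows why pairwise additive shares cancel in the sum and how a dropped node's shares can be removed without a (t, n) threshold scheme.

```python
import secrets

MODULUS = 2**32  # assumed working modulus for integer-encoded updates


def make_shares(node_ids):
    """shares[i][j] is the random value node i sends to node j (hypothetical layout)."""
    return {
        i: {j: secrets.randbelow(MODULUS) for j in node_ids if j != i}
        for i in node_ids
    }


def mask_update(node, update, shares):
    """Mask an update with (sum of shares sent) - (sum of shares received)."""
    sent = sum(shares[node].values())
    received = sum(shares[j][node] for j in shares if j != node)
    return (update + sent - received) % MODULUS


def aggregate(masked, shares, dropped):
    """Sum the surviving masked updates and cancel shares tied to dropped nodes.

    In a real deployment the server would see only the shares that the
    surviving nodes reveal for each dropped node; here the full table is
    passed in for brevity.
    """
    total = sum(masked.values())
    for i in masked:        # surviving nodes
        for d in dropped:   # nodes whose updates never arrived
            total -= shares[i][d] - shares[d][i]
    return total % MODULUS


cluster = ["a", "b", "c", "d"]
updates = {"a": 5, "b": 7, "c": 11, "d": 13}
shares = make_shares(cluster)
masked = {n: mask_update(n, updates[n], shares) for n in cluster}

# No dropout: all masks cancel and only the sum of updates remains.
full_sum = aggregate(masked, shares, dropped=[])

# Node "d" drops out: the shares exchanged with "d" are removed from the sum.
survivors = {n: m for n, m in masked.items() if n != "d"}
partial_sum = aggregate(survivors, shares, dropped=["d"])
```

Each node in this sketch generates and receives one share per cluster member, which is consistent with the O(C) per-node cost stated above, although the paper's actual message flow may differ.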
engineering, electrical & electronic; computer science, information systems; physics, applied