Niousha Nazemi, Omid Tavallaie, Shuaijun Chen, Anna Maria Mandalari, Kanchana Thilakarathna, Ralph Holz, Hamed Haddadi, Albert Y. Zomaya
Abstract: Federated Learning (FL) is a promising distributed learning framework designed for privacy-aware applications. FL trains models on client devices without sharing the clients' data and generates a global model on a server by aggregating model updates. Traditional FL approaches risk exposing sensitive client data when plain model updates are transmitted to the server, making them vulnerable to security threats such as model inversion attacks, in which the server infers a client's original training data by monitoring how the trained model changes across rounds. Google's Secure Aggregation (SecAgg) protocol addresses this threat by employing a double-masking technique, secret sharing, and cryptographic computations in honest-but-curious and adversarial scenarios with client dropouts. However, in scenarios without an active adversary, the computational and communication cost of SecAgg grows significantly with the number of clients. To address this issue, we propose ACCESS-FL, a communication-and-computation-efficient secure aggregation method designed for honest-but-curious scenarios in stable FL networks with a limited rate of client dropout. ACCESS-FL reduces the computation/communication cost to a constant level (independent of the network size) by generating shared secrets between only two clients and eliminating the need for double masking, secret sharing, and cryptographic computations. We evaluate ACCESS-FL in experiments on the MNIST, FMNIST, and CIFAR datasets. The results demonstrate that our proposed method significantly reduces computation and communication overhead compared to the state-of-the-art methods SecAgg and SecAgg+.
### What problems does this paper attempt to solve?
This paper aims to reduce the excessive computation and communication costs of secure aggregation in Federated Learning (FL) in stable network environments. Specifically, traditional secure aggregation methods, such as Google's Secure Aggregation (SecAgg) and its improved version SecAgg+, must generate shared secrets, perform encryption and decryption operations, and handle client disconnections. With a large number of clients, these steps significantly increase computation and communication overhead.
#### Main problems include:
1. **High computation cost**:
- In SecAgg, each client needs to generate a shared secret with every other client and create a unique random element. These values are expanded into shared masks and an individual mask through a pseudo-random generator function, so the client's computational complexity reaches \(O(|C|^2)\). In large-scale FL systems (such as Google Gboard with one billion clients), this poses a huge computational challenge to client devices.
- The server must also reconstruct the shared masks of disconnected clients and regenerate the self-masks of participating clients, which further increases the computational burden.
2. **High communication cost**:
- Each client in SecAgg needs to send two public keys, encrypted shares of its private key and random element, and its masked model update, resulting in a communication cost of \(O(|C|)\) per client.
- The server is responsible for distributing encrypted values and broadcasting public aggregated model updates, which raises its communication cost to \(O(|C|^2)\). In large-scale FL scenarios (such as thousands of participating devices in smart city applications), the server's communication load becomes a significant bottleneck.
3. **Handling client disconnections or delayed messages**:
- In real FL deployments, an unstable Internet connection may interrupt the process of creating the global model. SecAgg uses a double-masking technique to keep each client's model update secure when a user disconnects or its update is delayed, but this involves multiple encryption operations, which increases the computation and communication overhead.
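The double-masking idea described above can be illustrated with a minimal sketch. It is not the SecAgg implementation: `random.Random` stands in for a cryptographic pseudo-random generator, the seeds are drawn from one local generator instead of being derived through key agreement, secret sharing and dropout recovery are omitted, and the names `prg`, `double_mask`, and `secagg_style_round` are hypothetical. It only shows that every client holds one pairwise mask per other client, that pairwise masks cancel in the sum, and that the server still has to remove the self-masks.

```python
import random

MOD = 2**32  # assumption: updates are integer vectors, arithmetic is modulo MOD


def prg(seed, length):
    """Expand a seed into a mask vector (random.Random is NOT a secure PRG)."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(length)]


def double_mask(i, update, self_seed, pair_seeds):
    """Apply one self-mask plus one pairwise mask per other client to client i's update."""
    masked = [(x + m) % MOD for x, m in zip(update, prg(self_seed, len(update)))]
    for j, seed in pair_seeds.items():
        sign = 1 if i < j else -1  # opposite signs, so each pair cancels in the sum
        mask = prg(seed, len(update))
        masked = [(y + sign * m) % MOD for y, m in zip(masked, mask)]
    return masked


def secagg_style_round(updates, seed=0):
    """Return the aggregated update and the number of pairwise secrets used."""
    n, length = len(updates), len(updates[0])
    rng = random.Random(seed)
    self_seeds = [rng.getrandbits(64) for _ in range(n)]
    # One shared seed per unordered pair of clients: n * (n - 1) / 2 in total.
    shared = {(i, j): rng.getrandbits(64) for i in range(n) for j in range(i + 1, n)}
    masked = []
    for i in range(n):
        pair_seeds = {j: shared[(min(i, j), max(i, j))] for j in range(n) if j != i}
        masked.append(double_mask(i, updates[i], self_seeds[i], pair_seeds))
    # Server side: sum the masked updates, then subtract the regenerated self-masks.
    total = [sum(col) % MOD for col in zip(*masked)]
    for s in self_seeds:
        total = [(t - m) % MOD for t, m in zip(total, prg(s, length))]
    return total, len(shared)
```

With four clients, `secagg_style_round([[1, 2], [3, 4], [5, 6], [7, 8]])` recovers the plain sum of the updates while using six pairwise secrets. The number of secrets grows quadratically with the number of clients, which is the scaling problem the list above describes.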
#### Solutions proposed in the paper:
To address the above challenges, the paper proposes a new communication-and-computation-efficient protocol, ACCESS-FL. The protocol addresses these problems in the following ways:
- **Reducing the number of shared masks**: Each client only needs to generate two shared masks, regardless of the network scale, thereby significantly reducing communication and computation overhead.
- **Eliminating costly cryptographic operations**: Computationally expensive operations such as encryption, decryption, and Shamir's secret sharing are no longer needed.
- **Simplifying the communication process**: The client only needs to share one public key and a masked model, reducing the number of messages transmitted in the network.
- **Reducing the server's computation cost**: The server no longer needs to reconstruct or regenerate masks in order to cancel them, which reduces its computational burden.
Through these improvements, ACCESS-FL can effectively reduce computation and communication costs in honest-but-curious FL scenarios, especially in stable network environments, while maintaining security against model inversion attacks.
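A minimal sketch of masking with only two shared masks per client is given below. It rests on assumptions that are not taken from the paper: the clients are arranged in a fixed ring, each client shares one secret with its successor and one with its predecessor, no client drops out, and `random.Random` stands in for a secure pseudo-random generator. The paper's actual peer-selection rule and key agreement are not reproduced here, and the names `prg` and `two_peer_round` are hypothetical.

```python
import random

MOD = 2**32  # assumption: updates are integer vectors, arithmetic is modulo MOD


def prg(seed, length):
    """Expand a seed into a mask vector (random.Random is NOT a secure PRG)."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(length)]


def two_peer_round(updates, seed=0):
    """Mask each update with exactly two shared masks and aggregate by plain summation."""
    n, length = len(updates), len(updates[0])
    rng = random.Random(seed)
    # edge_seed[i] is the secret shared by client i and its ring successor (i + 1) % n.
    edge_seed = [rng.getrandbits(64) for _ in range(n)]
    masked = []
    for i in range(n):
        add = prg(edge_seed[i], length)            # mask shared with the successor
        sub = prg(edge_seed[(i - 1) % n], length)  # mask shared with the predecessor
        masked.append([(x + a - s) % MOD for x, a, s in zip(updates[i], add, sub)])
    # Each edge mask is added once and subtracted once, so the sum needs no unmasking.
    total = [sum(col) % MOD for col in zip(*masked)]
    return masked, total
```

In this sketch the per-client work does not depend on the number of clients, and the server only sums the masked updates. The trade-off is that a client that drops out after masks are fixed leaves two masks uncancelled, which is why such a scheme suits stable networks with few dropouts.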