RASE: Efficient Privacy-preserving Data Aggregation against Disclosure Attacks for IoTs

Zuyan Wang, Jun Tao, Dika Zou
2024-05-31
Abstract: The growing popular awareness of personal privacy raises the following quandary: what is the new paradigm for collecting and protecting the data produced by an ever-increasing number of sensor devices? Most previous studies on the co-design of data aggregation and privacy preservation assume that a trusted fusion center adheres to privacy regimes. Very recent work has taken steps towards relaxing this assumption by allowing data contributors to locally perturb their own data. Although these solutions withhold some data content to mitigate privacy risks, they have been shown to offer insufficient protection against disclosure attacks. Aiming at providing a more rigorous data safeguard for the Internet of Things (IoTs), this paper initiates the study of privacy-preserving data aggregation. We propose a novel paradigm (called RASE), which can be generalized into a 3-step sequential procedure: noise addition, followed by random permutation, and then parameter estimation. Specifically, we design a differentially private randomizer, which carefully guides data contributors to obfuscate the truth. Then, a shuffler is employed to receive the noisy data from all data contributors. After that, it breaks the correct linkage between senders and receivers by applying a random permutation. The estimation phase involves using inaccurate data to calculate an approximate aggregate value. Extensive simulations are provided to explore the privacy-utility landscape of our RASE.
Cryptography and Security
What problem does this paper attempt to address?
The paper asks how to perform privacy-preserving data aggregation efficiently in the Internet of Things (IoT) while resisting disclosure attacks. As awareness of personal privacy grows, a new paradigm is needed for collecting and protecting the data generated by an increasing number of sensor devices.

### Problem Description

1. **Privacy Leakage Risks**:
   - Most existing studies on the co-design of data aggregation and privacy protection assume a trusted fusion center that complies with privacy regulations. This assumption does not always hold in practice.
   - Recent work relaxes this assumption by allowing data contributors to locally perturb their own data, but these solutions still do not provide sufficient protection against disclosure attacks. For example, a malicious third party may be able to reverse-engineer the noise-adding process using external knowledge (such as IP addresses) and ultimately expose the original data.
2. **Balance between Data Utility and Privacy**:
   - Data accuracy is crucial for practical applications. Adding noise blindly may reduce the quality of service, while adding too little noise may leave the original data recoverable.
   - For example, in location-based services, injecting too much noise may increase errors in resource allocation decisions and affect system stability.

### Research Motivation

The question raised in this paper is therefore: can customized data utility be achieved while suppressing disclosure attacks? Ideally, a lower privacy cost can be maintained while enhancing privacy protection.

### Solution Overview

To solve the above problems, the paper proposes a new Randomization-Shuffling-Estimation paradigm (RASE). Its main steps are as follows:

1. **Noise Addition**: Each data contributor adds noise to the original data on their local device to obfuscate the real data.
2. **Random Permutation**: A shuffler receives all noisy data and breaks the correct association between sender and receiver by applying a random permutation.
3. **Parameter Estimation**: An approximate aggregate value is calculated from the inaccurate data.

### Main Contributions

- Proposes a dynamic feedback mechanism to standardize the data error range from the fusion center.
- Designs a budget-aware local randomizer (BR) and a robust shuffler (RS) to achieve privacy protection.
- Introduces several estimators (such as the sample mean estimator, the maximum likelihood estimator, and the bootstrap estimator) to approximate the average of the aggregated data.

With these methods, RASE protects data privacy while preserving data utility to a certain extent. Experimental results show that the new paradigm achieves a better trade-off between privacy and utility and is superior to existing algorithms.

### Formula Display

- Interval accuracy:
  \[ \Pr\left[y_i \geq (1-\beta)x_i \text{ and } y_i \leq (1 + \beta)x_i\right] \geq \rho \]
  where \(\beta, \rho\in[0,1]\). The noisy version \(y_i\) lies within a relative error \(\beta\) of the original data \(x_i\) with probability at least \(\rho\).
- Lower bound on the privacy budget of the Laplace mechanism:
  \[ \epsilon_s \geq -\frac{\Delta(x)\cdot\ln(1 - \rho)}{\beta x_{\max}} \]
  where \(\Delta(x)=x_{\max}-x_{\min}\) is the data range.

Through these formulas and methods, the paper provides an effective solution to the privacy protection challenges in the IoT environment.
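The three steps and the Laplace budget bound above can be illustrated with a minimal Python sketch. It is not the paper's implementation: all function names are hypothetical, plain Laplace noise with sensitivity \(\Delta(x)\) stands in for the budget-aware randomizer (BR), a uniform random permutation stands in for the robust shuffler (RS), and only the sample mean estimator is shown. It also assumes \(\beta > 0\), \(\rho < 1\), and \(x_{\max} > 0\) so that the bound is finite.

```python
import math
import random


def min_epsilon(x_min, x_max, beta, rho):
    """Lower bound on the privacy budget, as stated in the formula above:
    eps >= -Delta(x) * ln(1 - rho) / (beta * x_max), Delta(x) = x_max - x_min."""
    if not (0.0 < beta <= 1.0 and 0.0 <= rho < 1.0 and x_max > 0.0):
        raise ValueError("assumes 0 < beta <= 1, 0 <= rho < 1, x_max > 0")
    delta = x_max - x_min
    return -delta * math.log(1.0 - rho) / (beta * x_max)


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def randomize(x, epsilon, sensitivity, rng):
    """Step 1 (noise addition): each contributor perturbs its own reading."""
    return x + laplace_noise(sensitivity / epsilon, rng)


def shuffle(reports, rng):
    """Step 2 (random permutation): return the reports in random order,
    so that position no longer identifies the sender."""
    shuffled = list(reports)
    rng.shuffle(shuffled)
    return shuffled


def estimate_mean(reports):
    """Step 3 (parameter estimation): sample mean of the noisy reports."""
    return sum(reports) / len(reports)


def run_pipeline(readings, x_min, x_max, beta, rho, seed=0):
    """Run noise addition, shuffling, and estimation in sequence.
    Returns (estimated mean, privacy budget used)."""
    rng = random.Random(seed)
    eps = min_epsilon(x_min, x_max, beta, rho)  # smallest budget the bound allows
    sensitivity = x_max - x_min                 # assumption: sensitivity = Delta(x)
    noisy = [randomize(x, eps, sensitivity, rng) for x in readings]
    return estimate_mean(shuffle(noisy, rng)), eps
```

For instance, calling `run_pipeline(readings, 10.0, 20.0, beta=0.1, rho=0.9)` on readings drawn from the range 10 to 20 uses the budget given by the bound and returns an estimate of the mean. Because the Laplace noise has zero mean, the sample mean of the shuffled reports approaches the true mean as the number of contributors grows, and the permutation does not change it.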