A Noise Generation Scheme Based on Huffman Coding for Preserving Privacy

Iuon-Chang Lin, Li-Cheng Yang
DOI: https://doi.org/10.1007/978-3-319-76451-1_15
2018-01-01
Abstract: Cloud computing has risen in recent years because of its low cost, robustness, flexibility, and ubiquitous nature, and the volume of data held by organizations is growing rapidly. Large amounts of data can be used in many data analysis applications in business, medicine, and government. This raises privacy issues: a dealer who wants to understand customer behavior for marketing purposes may release data to a third-party data analysis company. To preserve privacy in databases, this paper proposes an efficient noise generation scheme based on the Huffman coding algorithm. In Huffman coding, a character with a lower occurrence frequency receives a longer code, and vice versa. This property suits privacy protection in databases: a tuple with a lower occurrence frequency receives more noise. The paper presents a noise matrix, a set of noise values, built on this concept. Although the scheme distorts the data by replacing original values, it does not affect data analysis. The experiments measure the running time of noise generation for integer and real numbers. Overall, this paper offers a different concept for perturbing original values and proposes an efficient data perturbation scheme.
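The core idea in the abstract, that rarer values get longer Huffman codes and therefore more noise, can be illustrated with a minimal Python sketch. It is not the authors' noise matrix, which the abstract does not specify. The function names `huffman_code_lengths` and `perturb`, the `unit` parameter, and the rule that noise magnitude scales linearly with code length are all assumptions made for illustration only.

```python
import heapq
import random
from collections import Counter
from itertools import count


def huffman_code_lengths(values):
    """Return {value: Huffman code length} for the distinct values."""
    freq = Counter(values)
    if len(freq) == 1:
        # A single distinct value still needs one bit.
        return {next(iter(freq)): 1}
    tie = count()  # unique tie-breaker so dicts are never compared
    # Each heap entry: (total frequency, tie-breaker, {value: depth so far})
    heap = [(f, next(tie), {v: 0}) for v, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {v: depth + 1 for v, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def perturb(value, code_length, unit=1.0, rng=random):
    """Hypothetical rule: noise bound grows linearly with code length."""
    return value + rng.uniform(-1.0, 1.0) * unit * code_length


# Frequencies: 10 -> 8, 20 -> 4, 30 -> 2, 40 -> 1, 50 -> 1
column = [10] * 8 + [20] * 4 + [30] * 2 + [40, 50]
lengths = huffman_code_lengths(column)
# lengths == {10: 1, 20: 2, 30: 3, 40: 4, 50: 4}
noisy = [perturb(v, lengths[v], unit=0.5) for v in column]
```

In this sketch the most frequent value, 10, is perturbed by at most 0.5, while the rarest values, 40 and 50, are perturbed by up to 2.0, reflecting the stated principle that low-frequency tuples receive more noise.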