Enabling Ciphertext Deduplication for Secure Cloud Storage and Access Control

Heyi Tang,Yong Cui,Chaowen Guan,Jianping Wu,Jian Weng,Kui Ren
DOI: https://doi.org/10.1145/2897845.2897846
2016-01-01
Abstract:To secure cloud storage and enforce access control, data encryption has become essential, given the ever increasing cyber threat everywhere. Attribute-based Encryption (ABE) crypto systems are widely considered as a promising solution under such a context for its security strength, scalability and control flexibility. One major challenge, however, for applying ABE-based techniques in real world applications is its high overhead in various aspects. In this research, we are particularly concerned with the storage size expansion in existing ABE schemes. This combined with the vast-size nature of the cloud data poses an enormous challenge to the effective usage of the cloud data storage space and affects the utility of data deduplication. Normally, data deduplication is carried out based on identifying similar and even identical contents both within and between data files, however, these patterns will be destroyed after performing data encryption using any semantically secure encryption scheme including ABE. In this research, we focus on ciphertexts deduplication under ABE, which to our best knowledge is the first of such an effort. Our fundamental observation stems from the structure of ABE ciphertexts and the possible similarities among different access structures. We show how to design a secure ciphertext deduplication scheme based on a classical CP-ABE scheme by innovatively modifying the construction with a recursive algorithm, eliminating the duplicated secrets and adding additional randomness to some certain ciphertext. We then give a detailed analysis on the proposed scheme with respect to both efficiency and security. To thoroughly assess the performance of the proposed scheme, we also implement a prototype system and conduct comprehensive experiments, which shows that our ciphertext reduplication scheme could reduce up to 80% computation and storage cost in the best case.
What problem does this paper attempt to address?