DEDUCT: A Secure Deduplication of Textual Data in Cloud Environments

Kiana Ghassabi, Peyman Pahlevani
DOI: https://doi.org/10.1109/access.2024.3402544
IF: 3.9
2024-05-24
IEEE Access
Abstract: The exponential growth of textual data in Vision-and-Language Navigation tasks poses significant challenges for data management in large-scale storage systems. Data deduplication has emerged as a practical strategy for data reduction in large-scale storage systems; however, it has also raised security concerns. This paper introduces DEDUCT, an innovative data deduplication method for textual data. DEDUCT employs a hybrid approach that combines cloud-side and client-side deduplication mechanisms to achieve high compression rates while maintaining data security. DEDUCT's lightweight preprocessing and client-side deduplication make it suitable for resource-constrained devices like IoT devices. It has also been designed to resist side-channel attacks. Experimental evaluations on the Touchdown dataset, consisting of human-written navigation instructions for routes, demonstrate the effectiveness of DEDUCT. It achieves compression rates of nearly 66%, significantly reducing storage requirements while preserving the confidentiality of textual data. This substantial reduction in storage demands can lead to significant cost savings and improved efficiency in large-scale data management systems.
computer science, information systems; telecommunications; engineering, electrical & electronic