Abstract: With the rising prevalence of cloud computing, the number of customers transferring information to cloud storage has grown significantly. On the one hand, the rapidly rising data volume in the cloud is accompanied by a large amount of replicated data. On the other hand, if a deduplicated cloud backup keeps only a single copy of the stored information, the manipulation or loss of that copy may cause irreparable failure. Thus, both file deduplication and integrity auditing are necessary, and how to achieve them safely and efficiently must be addressed urgently in academic and commercial contexts. To address this task, we propose a flexible, decentralized symmetry deduplication architecture for the cloud that uses application awareness, data similarity, and locality to streamline decentralized deduplication with two tiers, inter-node and application-level. It first routes application data at the directory level through application-oriented routing to preserve locality, and then assigns data of the same kind to the same cloud backup storage node by means of a handprinting-based routing model to attain adequate global deduplication performance. During file deduplication, we build a new ownership mechanism that ensures the consistency of tagging and symmetrical modeling and verifies shared ownership. In addition, we design an effective scheme for maintaining ownership policies. A user-assisted key is used for intra-user block deduplication, which introduces a probabilistic key process and reduces the storage needed for keys. Finally, security and performance analyses demonstrate that our system ensures data integrity and accuracy and is effective in managing data ownership.
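As a rough illustration of handprinting-based routing, the sketch below sends a batch of data to the storage node whose fingerprint index overlaps most with the batch's handprint. It is not the authors' implementation: the fixed-size chunking, the SHA-1 fingerprints, the handprint size `k`, and all names (`Node`, `handprint`, `route`) are assumptions made for the example.

```python
import hashlib


def chunk_fingerprints(data, chunk_size=8):
    """Split data into fixed-size chunks (an assumption) and fingerprint each one."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def handprint(fingerprints, k=4):
    """Use the k smallest distinct fingerprints as a compact similarity signature."""
    return set(sorted(set(fingerprints))[:k])


class Node:
    """Hypothetical storage node that keeps an index of the fingerprints it holds."""

    def __init__(self, name):
        self.name = name
        self.index = set()

    def store(self, fingerprints):
        """Store only unseen chunks; return how many were new."""
        new = set(fingerprints) - self.index
        self.index |= new
        return len(new)


def route(nodes, fingerprints, k=4):
    """Pick the node with the largest handprint overlap; break ties by lightest load."""
    hp = handprint(fingerprints, k)
    return max(nodes, key=lambda n: (len(hp & n.index), -len(n.index)))


nodes = [Node("node-0"), Node("node-1")]

batch_a = chunk_fingerprints(bytes(range(64)))
first = route(nodes, batch_a)
first.store(batch_a)

batch_b = chunk_fingerprints(bytes(range(64, 128)))
second = route(nodes, batch_b)
second.store(batch_b)

# A near-duplicate of batch A: only the last chunk differs.
batch_a2 = chunk_fingerprints(bytes(range(56)) + bytes(range(200, 208)))
third = route(nodes, batch_a2)
new_chunks = third.store(batch_a2)
```

In this toy run the near-duplicate batch lands on the node that already holds most of its chunks, so only the changed chunk is stored. Comparing a small handprint instead of the full fingerprint list keeps the routing decision cheap.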

Application-Aware Big Data Deduplication in Cloud Environment

Boafft: Distributed Deduplication for Big Data Storage in the Cloud

A Delayed Container Organization Approach to Improve Restore Speed for Deduplication Systems.

Decentralized and Privacy Sensitive Data De-Duplication Framework for Convenient Big Data Management in Cloud Backup Systems

PeerDedupe: Insights into the Peer-Assisted Sampling Deduplication.

A Novel Optimization Method to Improve De-duplication Storage System Performance

Research on Data Routing Strategy of Deduplication in Cloud Environment

Droplet: A Distributed Solution of Data Deduplication

Adaptive Pipeline for Deduplication

A Thorough Investigation of Content-Defined Chunking Algorithms for Data Deduplication

FuzzyDedup: Secure Fuzzy Deduplication for Cloud Storage

D3: A Dynamic Dual-Phase Deduplication Framework for Distributed Primary Storage.

Duplicacy: A New Generation of Cloud Backup Tool Based on Lock-Free Deduplication

Try Managing Your Deduplication Fine-Grained-ly: A Multi-tiered and Dynamic SLA-Driven Deduplication Framework for Primary Storage.

SAUD: Semantics-Aware and Utility-Driven Deduplication Framework for Primary Storage.

Towards Cluster-wide Deduplication Based on Ceph

Ss-Dedup: A High Throughput Stateful Data Routing Algorithm For Cluster Deduplication System

GLE-Dedup: A Globally–Locally Even Deduplication by Request-Aware Placement for Better Read Performance

DCStore: A Deduplication-Based Cloud-of-Clouds Storage Service

Edge Data Deduplication Under Uncertainties: A Robust Optimization Approach

ESDedup: An efficient and secure deduplication scheme based on data similarity and blockchain for cloud-assisted medical storage systems