PeerDedupe: Insights into the Peer-Assisted Sampling Deduplication

Yuanjian Xing, Zhenhua Li, Yafei Dai
DOI: https://doi.org/10.1109/p2p.2010.5570004
2010-01-01
Abstract: As the volume of digital data grows rapidly toward a worldwide storage crisis, data deduplication is playing an increasingly prominent role in data storage. Driven by the problems of mainstream server-side deduplication schemes, there has recently been a tendency to introduce peer-assisted methods into deduplication systems. However, this topic remains vague and lacks thorough research. In this paper, we conduct an in-depth, quantitative investigation of peer-assisted deduplication. Through measurements, we observe that inter-peer duplication accounts for a large proportion of the total duplication and exhibits strong peer locality. Based on these observations, we propose PeerDedupe, a novel peer-assisted sampling deduplication approach. Experiments show that PeerDedupe can remove over 98% of the duplication with each peer coordinating with no more than 5 other peers, and that it requires much less server RAM than existing approaches.