GLE-Dedup: A Globally–Locally Even Deduplication by Request-Aware Placement for Better Read Performance

Mingzhu Deng, Wei Chen, Nong Xiao, Songping Yu, Yupeng Hu
DOI: https://doi.org/10.1007/s10766-016-0450-5
2016-01-01
International Journal of Parallel Programming
Abstract: Deduplication is a fundamental way to eliminate replicas and save space and network bandwidth in various storage systems. However, most existing deduplication systems leave room for improvement on normal reads, which carry crucial weight in the currently popular WORM (write once, read many) access model. Specifically, most existing deduplication systems achieve a globally even layout via the simple round-robin algorithm and ignore the relationship between chunks and IO requests in the placement policy. They therefore fail to place the chunks of a single request evenly across nodes, which causes a read imbalance problem. In this paper, we focus on deduplication over small-scale storage systems with adequate bandwidth between nodes and propose GLE-Dedup, a deduplication system with a request-aware placement policy that achieves even placement both globally and locally for better read performance. Unlike conventional chunk-based placement, GLE-Dedup places chunks in groups, with the group size mainly determined by the request ID to achieve request awareness. Chunks belonging to the same IO request are placed on different independent nodes as much as possible to achieve even placement locally within a request, while rotation among chunk groups maintains global balance. In this way, more parallelism is exploited for higher read performance. Experimental results under the real-world CAFTL trace show the effectiveness and advantage of GLE-Dedup over B-Dedup and R-Dedup, which use round-robin and random placement, respectively. For example, GLE-Dedup improves read performance by about 18.9% and 24% compared with B-Dedup and R-Dedup, respectively.
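The difference between chunk-based round-robin placement and request-aware placement can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It is a minimal illustration that assumes each request is a list of chunk fingerprints and that a duplicate chunk stays on the node where it was first stored. The function names `round_robin_place`, `request_aware_place`, and `max_read_load` are hypothetical.

```python
from collections import Counter


def round_robin_place(requests, num_nodes):
    """Baseline sketch: each new unique chunk goes to the next node in turn."""
    location = {}
    next_node = 0
    for chunks in requests:
        for fp in chunks:
            if fp not in location:  # duplicates are not stored again
                location[fp] = next_node
                next_node = (next_node + 1) % num_nodes
    return location


def request_aware_place(requests, num_nodes):
    """Illustrative request-aware sketch (an assumption, not the paper's method).

    New chunks of a request go to the nodes that currently hold the fewest
    chunks of that same request; a rotating start node breaks ties so that
    consecutive requests do not all begin at node 0.
    """
    location = {}
    start = 0
    for chunks in requests:
        unique = list(dict.fromkeys(chunks))  # keep order, drop repeats
        # Nodes already holding deduplicated chunks of this request.
        used = Counter(location[fp] for fp in unique if fp in location)
        new = [fp for fp in unique if fp not in location]
        for fp in new:
            node = min(
                range(num_nodes),
                key=lambda n: (used[n], (n - start) % num_nodes),
            )
            location[fp] = node
            used[node] += 1
        start = (start + len(new)) % num_nodes  # rotate for the next request
    return location


def max_read_load(chunks, location):
    """Largest number of a request's chunks that sit on a single node."""
    return max(Counter(location[fp] for fp in set(chunks)).values())


# Second request shares chunk "a" with the first one.
requests = [["a", "b", "c", "d"], ["a", "e", "f", "g"]]
rr = round_robin_place(requests, num_nodes=4)
ra = request_aware_place(requests, num_nodes=4)
print(max_read_load(requests[1], rr))  # 2: "a" and "e" both land on node 0
print(max_read_load(requests[1], ra))  # 1: every chunk on a different node
```

In this toy case, reading the second request under round-robin placement waits on one node that serves two chunks, while the request-aware layout spreads the four chunks over four nodes and can read them in parallel.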