DeDu: Building a Deduplication Storage System over Cloud Computing.

Zhe Sun, Jun Shen, Jianming Yong
DOI: https://doi.org/10.1109/cscwd.2011.5960097
2011-01-01
Abstract: This paper presents a deduplication storage system over cloud computing. Our deduplication storage system consists of two major components: a front-end deduplication application and the Hadoop Distributed File System (HDFS). HDFS is a common back-end distributed file system, which is used with a Hadoop database. We use HDFS to build a mass storage system and a Hadoop database to build a fast indexing system. With the deduplication applications, a scalable and parallel deduplicated cloud storage system can be built effectively. We further use VMware to generate a simulated cloud environment. The simulation results demonstrate that our deduplication cloud storage system is more efficient than traditional deduplication approaches.
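The two-part design described in the abstract (an index layer that decides whether content is already stored, and a storage layer that holds each unique piece of content once) can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `DedupStore`, the use of SHA-256 as the fingerprint, file-level granularity, and the plain dictionaries standing in for the Hadoop database and the distributed file system are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy file-level deduplicating store (hypothetical, for illustration).

    `index` stands in for the fast indexing layer (a Hadoop database in the
    abstract) and `blobs` for the mass storage layer (HDFS in the abstract).
    Both are in-memory dicts here, which is an assumption of this sketch.
    """

    def __init__(self):
        self.index = {}  # file name -> content fingerprint
        self.blobs = {}  # content fingerprint -> bytes, stored only once

    def put(self, name: str, data: bytes) -> bool:
        """Store `data` under `name`; return True if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            # First time this content is seen: write it to the storage layer.
            self.blobs[digest] = data
        # Always record the name -> fingerprint link in the index layer.
        self.index[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        """Look up the fingerprint in the index, then fetch the stored bytes."""
        return self.blobs[self.index[name]]


store = DedupStore()
wrote_first = store.put("a.txt", b"hello cloud")
wrote_second = store.put("copy_of_a.txt", b"hello cloud")
```

In this sketch the second `put` only adds an index entry, because identical content is already present in the storage layer; both names resolve to the same stored bytes.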