SLIMSTORE: A Cloud-based Deduplication System for Multi-version Backups
Zihao Zhang,Huiqi Hu,Zhihui Xue,Changcheng Chen,Yang Yu,Cuiyun Fu,Xuan Zhou,Feifei Li
DOI: https://doi.org/10.1109/ICDE51399.2021.00164
2021-01-01
Abstract:Cloud backup is becoming the preferred way for users to support disaster recovery. In addition to its convenience, users are deeply concerned about reducing storage costs in the face of large-scale backup data. Data deduplication is an effective method for backup storage. However, current deduplicate methods lack the utilization of cloud resources to provide scalable backup service for cloud backup users, and cannot meet the biased preference for different backup versions. For new backup versions, users want higher deduplicate and restore speed to reduce the waiting time. Conversely, reducing storage costs is more necessary for old backup versions. In this paper, we present SLIMSTORE, with a cloud-based deduplication architecture that disassembles the system into a storage layer and a computing layer to support elastic utilization of cloud resources. We propose two types of processing nodes with different design focuses to meet the needs of cloud-based backup. The L-node exploits locality and similarity, and adopts a history-aware strategy to provide fast online deduplication service. L-node also optimizes online restoration to realize high restore efficiency. Meanwhile, the G-node provides exact deduplication offline for the old versions, and helps the restore performance of the new versions by optimizing their physical storage. We compare SLIMSTORE with some state-of-art deduplicate and restore methods. Experimental results show that SLIMSTORE can achieve fast deduplication, efficient restoration, and effective space reduction. Furthermore, SLIMSTORE attains scalable deduplication and restoration.