HSM\(^{2}\): A Hybrid and Scalable Metadata Management Method in Distributed File Systems

Yiduo Wang, Youxu Chen, Xinyang Shao, Jinzhong Chen, Liu Yuan, Yinlong Xu
DOI: https://doi.org/10.1007/978-981-15-2767-8_19
2019-01-01
Abstract: In the big data era, metadata performance is critical in modern distributed file systems. Traditionally, metadata management strategies such as subtree partitioning focus on preserving namespace locality, while others such as hash-based mapping aim to offer good load balance. Nevertheless, none of these methods achieves both desirable properties simultaneously. To close this gap, we propose a novel metadata management scheme, HSM\(^{2}\), which combines subtree partitioning and hash-based mapping. We implemented HSM\(^{2}\) in CephFS, a widely deployed distributed file system, and conducted a comprehensive set of metadata-intensive experiments. Experimental results show that HSM\(^{2}\) can achieve better namespace locality and load balance simultaneously. Compared with CephFS, HSM\(^{2}\) can reduce the completion time by 70% and achieve a 3.9\(\times\) overall throughput speedup for a file-scanning workload.