CFS: Scaling Metadata Service for Distributed File System Via Pruned Scope of Critical Sections
Yiduo Wang, Yufei Wu, Cheng Li, Pengfei Zheng, Biao Cao, Yan Sun, Fei Zhou, Yinlong Xu, Yao Wang, Guangjun Xie
DOI: https://doi.org/10.1145/3552326.3587443
2023-01-01
Abstract: There is a fundamental tension between metadata scalability and POSIX semantics in distributed file systems. The bottleneck lies in the coordination, mainly locking, used to ensure strong metadata consistency, namely atomicity and isolation. CFS is a scalable, fully POSIX-compliant distributed file system that eliminates the metadata management bottleneck by pruning the scope of critical sections to reduce locking overhead. First, CFS adopts a tiered metadata organization that scales file attributes and the remaining namespace hierarchies independently, each with appropriate partitioning and indexing methods, eliminating cross-shard distributed coordination. Second, it scales up the performance of a single metadata shard with single-shard atomic primitives, shortening the lifespan of metadata requests and removing spurious conflicts. Third, CFS drops the metadata proxy layer in favor of lightweight, scalable client-side metadata resolving. CFS has been running in the production environment of Baidu AI Cloud for three years. Our evaluation with a 50-node cluster and microbenchmarks shows that CFS both improves throughput over the baselines HopsFS and InfiniFS by 1.76–75.82× and 1.22–4.10×, and reduces their average latency by up to 91.71% and 54.54%, respectively. Under higher contention and with larger directories, CFS's throughput benefits expand by one order of magnitude. For three real-world workloads with data accesses, CFS delivers 1.62–2.55× end-to-end throughput speedups and 35.06–62.47% tail latency reductions over InfiniFS.