VarFS: A Variable-sized Objects Based Distributed File System

Yili Gong, Yanyan Xu, Yingchun Lei, Wenjie Wang
DOI: https://doi.org/10.1109/hpcc-css-icess.2015.54
2015-01-01
Abstract: Cloud-based file systems have been widely adopted for personal and business use in recent years. Statistics show that approximately 25% of file operations from a typical user are random writes. Following traditional disk-based file systems, most distributed file systems are likewise built on objects or chunks of fixed size, which work well for sequential writes but poorly for random writes. This paper investigates the design paradigm of variable-sized objects for a distributed file system. A novel distributed file system named VarFS is presented, which incorporates variable object indexing and supports random write operations. VarFS reduces the amount of unnecessary data read and the number of objects modified in the face of updates, and consequently reduces the total amount of data transferred. The implementation is based on Ceph, and performance measurements show that it can achieve latency 1-2 orders of magnitude lower than Ceph on random writes. At the same time, the overhead for initial writes and re-writes is acceptable.
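The contrast between fixed-size and variable-sized objects can be illustrated with a small sketch. It is not VarFS's actual indexing scheme, which the abstract does not describe. It assumes, purely for illustration, that the random write is an insertion in the middle of a file, and that a variable-sized layout lets the affected object grow while later objects are only re-indexed. The function names and the 4 MiB object size are hypothetical choices.

```python
import math

MIB = 1024 * 1024


def fixed_objects_rewritten(file_size, object_size, offset, insert_len):
    """Count objects rewritten by an insert into fixed-size objects.

    Assumption: every byte after `offset` shifts by `insert_len`, so each
    object from the one containing `offset` to the end of the (now longer)
    file has to be rewritten.
    """
    first_touched = offset // object_size
    objects_after_insert = math.ceil((file_size + insert_len) / object_size)
    return objects_after_insert - first_touched


def variable_insert(object_sizes, offset, insert_len):
    """Insert into a file stored as variable-sized objects (illustrative).

    Assumption: only the object containing `offset` grows; later objects keep
    their contents and only their logical start offsets change in the index.
    Returns (new_object_sizes, objects_rewritten).
    """
    if not 0 <= offset <= sum(object_sizes):
        raise ValueError("offset outside file")
    sizes = list(object_sizes)
    start = 0
    for i, size in enumerate(sizes):
        is_last = i == len(sizes) - 1
        # An insert at the very end of the file lands in the last object.
        if offset < start + size or is_last:
            sizes[i] = size + insert_len
            return sizes, 1
        start += size
    # Empty file: the inserted bytes become the first object.
    return [insert_len], 1


if __name__ == "__main__":
    # A 64 MiB file in 4 MiB objects, with 100 bytes inserted at offset 10 MiB.
    fixed = fixed_objects_rewritten(64 * MIB, 4 * MIB, 10 * MIB, 100)
    _, variable = variable_insert([4 * MIB] * 16, 10 * MIB, 100)
    print(f"fixed-size objects rewritten: {fixed}")
    print(f"variable-sized objects rewritten: {variable}")
```

Under these assumptions the fixed-size layout rewrites every object from the insertion point onward, while the variable-sized layout touches a single object. This is the kind of reduction in modified objects and transferred data that the abstract attributes to VarFS.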