Performance-Aware Data Placement in Hybrid Parallel File Systems.

Shuibing He, Xian-He Sun, Bo Feng, Kun Feng
DOI: https://doi.org/10.1007/978-3-319-11197-1_43
2014-01-01
Abstract: Hybrid parallel file systems (PFSs), which consist of both HDD and SSD servers, provide a promising solution for data-intensive applications. In this study, we propose a performance-aware data placement (PADP) strategy to enable efficient data layout in hybrid PFSs. The basic idea of PADP is to distribute data across file servers using adaptive, varied-size file stripes chosen according to each server's storage performance. Using an effective data access cost model and a linear programming optimization method, PADP determines the appropriate stripe size for each file server. We have implemented PADP within OrangeFS, a parallel file system widely used in the HPC domain. Experimental results with a representative benchmark show that PADP can significantly improve the I/O performance of hybrid PFSs.
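The idea of varied-size stripes can be illustrated with a minimal sketch. It is not the paper's cost model or its linear program: it assumes each server's cost is simply stripe size divided by a fixed bandwidth, and it sizes stripes in proportion to that bandwidth so that all servers finish a striping round at about the same time. The function names, the bandwidth figures, and the 4 KB allocation unit are hypothetical and chosen only for illustration.

```python
def proportional_stripes(bandwidths_mb_s, round_size_kb, unit_kb=4):
    """Split one striping round across servers in proportion to bandwidth.

    Simplified stand-in for PADP's optimization: assumes per-server cost is
    stripe_size / bandwidth and ignores network and startup overheads.
    """
    total_bw = sum(bandwidths_mb_s)
    stripes = []
    for bw in bandwidths_mb_s:
        raw = round_size_kb * bw / total_bw
        # Round down to a multiple of the (assumed) allocation unit.
        stripes.append(int(raw // unit_kb) * unit_kb)
    # Give any remainder left by rounding to the fastest server.
    fastest = bandwidths_mb_s.index(max(bandwidths_mb_s))
    stripes[fastest] += round_size_kb - sum(stripes)
    return stripes


def round_time_ms(stripes_kb, bandwidths_mb_s):
    """Time for one round: the slowest server determines completion."""
    return max(
        (size_kb / 1024) / bw * 1000
        for size_kb, bw in zip(stripes_kb, bandwidths_mb_s)
    )


# Hypothetical hybrid PFS: two HDD servers and two SSD servers.
bandwidths = [100, 100, 300, 300]   # MB/s, assumed values
fixed = [256, 256, 256, 256]        # conventional equal-size stripes (KB)
varied = proportional_stripes(bandwidths, round_size_kb=1024)

print("fixed stripes :", fixed, round_time_ms(fixed, bandwidths), "ms")
print("varied stripes:", varied, round_time_ms(varied, bandwidths), "ms")
```

Under these assumptions, equal-size stripes leave the SSD servers idle while the HDD servers finish, whereas bandwidth-proportional stripes balance completion times. A real cost model would also have to account for request size, access pattern, and network transfer, which is presumably where the optimization step described in the abstract comes in.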