Boosting Parallel File System Performance Via Heterogeneity-Aware Selective Data Layout.

Shuibing He, Yang Wang, Xian-He Sun
DOI: https://doi.org/10.1109/tpds.2015.2504969
IF: 5.3
2015-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Hybrid parallel file systems (PFSs) that combine HDD servers with SSD servers provide a promising solution for data-intensive applications. The efficiency of a hybrid PFS depends on its data layout scheme. However, most current layout strategies are designed for homogeneous servers and address neither the heterogeneity of servers nor the varying access patterns of applications. In this paper, we propose HAS, a novel heterogeneity-aware selective data layout scheme for hybrid PFSs. HAS alleviates inter-server load imbalance by skewing the data distribution across heterogeneous servers according to their storage performance. Furthermore, to obtain the best performance for a specific access pattern, HAS selects, from three typical layout candidates, the static data layout policy with the lowest access cost as the final file data layout method. To adapt to mixed access patterns within an application, HAS uses a dynamic data layout scheme, which stores a file as multiple copies, each using a different data layout policy, and then selects the copy with the lowest access cost to serve each file request. We have implemented HAS within MPICH2 and OrangeFS. Experimental results show that HAS can significantly increase the I/O throughput of hybrid PFSs compared to existing data layout optimization methods.
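The two ideas in the abstract, skewing data toward faster servers and picking the layout with the lowest access cost, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the three layout candidates or give the cost model, so the candidate names (uniform, skewed, ssd_only), the Server fields, the bandwidth and latency figures, and the cost formula (fixed startup latency plus transfer time, taking the slowest server) are hypothetical and are not the HAS implementation in MPICH2 or OrangeFS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    kind: str              # "hdd" or "ssd"
    bandwidth_mb_s: float  # assumed sustained throughput (illustrative)
    latency_s: float       # assumed fixed per-request startup cost (illustrative)


def skewed_stripes(servers, request_kb):
    """Split one request across servers in proportion to their bandwidth."""
    total_bw = sum(s.bandwidth_mb_s for s in servers)
    return {s.name: request_kb * s.bandwidth_mb_s / total_bw for s in servers}


def candidate_layouts(servers, request_kb):
    """Build three hypothetical candidates; these names are not from the paper."""
    ssds = [s for s in servers if s.kind == "ssd"]
    return {
        "uniform": {s.name: request_kb / len(servers) for s in servers},
        "skewed": skewed_stripes(servers, request_kb),
        "ssd_only": {s.name: request_kb / len(ssds) for s in ssds},
    }


def access_cost(stripes_kb, servers):
    """Toy cost model: the request finishes when the slowest involved server does."""
    by_name = {s.name: s for s in servers}
    return max(
        by_name[name].latency_s + size_kb / 1024 / by_name[name].bandwidth_mb_s
        for name, size_kb in stripes_kb.items()
        if size_kb > 0
    )


def select_layout(servers, request_kb):
    """Return the name of the candidate layout with the lowest toy access cost."""
    layouts = candidate_layouts(servers, request_kb)
    return min(layouts, key=lambda name: access_cost(layouts[name], servers))


# Hypothetical cluster: two HDD servers and two SSD servers.
SERVERS = [
    Server("hdd0", "hdd", 100.0, 0.005),
    Server("hdd1", "hdd", 100.0, 0.005),
    Server("ssd0", "ssd", 300.0, 0.0001),
    Server("ssd1", "ssd", 300.0, 0.0001),
]

if __name__ == "__main__":
    for request_kb in (64, 65536):
        print(request_kb, select_layout(SERVERS, request_kb))
```

Under these assumed numbers, a 64 KB request selects ssd_only because the HDD startup latency dominates, while a 64 MB request selects skewed because all four servers then finish at nearly the same time. This is the kind of pattern-dependent choice the abstract describes, shown here only under a simplified cost model.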