A Migratory Heterogeneity-Aware Data Layout Scheme for Parallel File Systems.

Shuibing He, Xian-He Sun, Yang Wang, Chengzhong Xu
DOI: https://doi.org/10.1109/ipdps.2018.00122
2018-01-01
Abstract: Parallel file systems (PFSs) are widely deployed to improve the performance of high-performance computing (HPC) applications. In recent years, hybrid PFSs that consist of HDD and SSD servers have attracted much attention in the HPC community. However, existing data layout schemes do not adequately consider the characteristics of heterogeneous servers and heterogeneous access patterns, and may therefore suffer considerable inefficiencies. In this study, we propose MHA, a migratory heterogeneity-aware data layout scheme that improves data distribution in hybrid PFSs. More specifically, to accommodate heterogeneous access patterns, MHA first migrates file data into several regions, each with similar access patterns. Then, by leveraging a data access cost model, MHA determines the appropriate stripe sizes on heterogeneous servers to obtain the best performance for each region. We have implemented MHA in the MPI-IO library on top of the OrangeFS file system. Experimental results show that MHA can significantly improve the I/O performance of hybrid PFSs compared to existing data layout schemes.