HAS: Heterogeneity-Aware Selective Data Layout Scheme for Parallel File Systems on Hybrid Servers

Shuibing He, Xian-He Sun, Adnan Haider
DOI: https://doi.org/10.1109/IPDPS.2015.23
2015-01-01
Abstract: Hybrid parallel file systems (PFSs), consisting of multiple HDD and SSD I/O servers, provide a promising design for data-intensive applications. The efficiency of a hybrid PFS relies on the file's data layout. However, most current layout strategies are designed and optimized for homogeneous servers. Using them directly in a hybrid PFS addresses neither the heterogeneity of servers nor the varying access patterns of applications, making hybrid PFSs disappointingly inefficient. In this paper, we propose HAS, a novel heterogeneity-aware selective data layout scheme for hybrid PFSs. HAS alleviates inter-server load imbalance by skewing the data distribution across heterogeneous servers based on their storage performance. To improve the entire system's I/O efficiency, HAS adaptively selects the optimal data layout from three typical candidates according to the application's data access patterns, based on a newly developed selection and distribution algorithm. We have implemented HAS within OrangeFS to provide efficient data distribution for data-intensive applications. Our extensive experiments validate that HAS significantly increases the I/O throughput of hybrid PFSs compared to existing data layout optimization methods.
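The idea of skewing data distribution by server performance can be illustrated with a minimal sketch. This is not the HAS selection and distribution algorithm, which the abstract does not specify; it only shows a simple bandwidth-proportional split of one stripe. The function names, the bandwidth figures, and the 4 KiB alignment unit are assumptions made for illustration.

```python
def skewed_stripe_sizes(total_bytes, bandwidths, unit=4096):
    """Split total_bytes across servers in proportion to their bandwidth.

    bandwidths: assumed per-server throughput figures (any consistent unit).
    unit: assumed alignment granularity for each server's share.
    """
    total_bw = sum(bandwidths)
    # Proportional share per server, rounded down to a multiple of `unit`.
    sizes = [int(total_bytes * bw / total_bw) // unit * unit for bw in bandwidths]
    # Assign the rounding remainder to the (first) fastest server.
    fastest = max(range(len(sizes)), key=lambda i: bandwidths[i])
    sizes[fastest] += total_bytes - sum(sizes)
    return sizes


def slowest_finish_time(sizes, bandwidths):
    """A parallel request completes when its slowest server finishes."""
    return max(size / bw for size, bw in zip(sizes, bandwidths))


# Hypothetical hybrid setup: two HDD servers and two SSD servers.
bandwidths = [100, 100, 400, 400]
request = 1024 * 1024  # one 1 MiB request

uniform = [request // len(bandwidths)] * len(bandwidths)
skewed = skewed_stripe_sizes(request, bandwidths)

print("uniform:", uniform, slowest_finish_time(uniform, bandwidths))
print("skewed: ", skewed, slowest_finish_time(skewed, bandwidths))
```

Under these assumed numbers, the uniform layout is bounded by the HDD servers, while the skewed layout brings the per-server finish times close together, which is the load-balancing effect the abstract describes.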