A cost-aware region-level data placement scheme for hybrid parallel I/O systems

Shuibing He, Xian-He Sun, Bo Feng, Xin Huang, Kun Feng
DOI: https://doi.org/10.1109/CLUSTER.2013.6702615
2013-01-01
Abstract: Parallel I/O systems are the most commonly used engineering solution to mitigate the performance mismatch between CPUs and disks; however, their effectiveness is application dependent, and they may not work well for certain data access requests. Emerging solid state drives (SSDs) deliver better performance but incur a high monetary cost. While SSDs cannot always replace HDDs, the hybrid SSD-HDD approach uniquely addresses common performance issues in parallel I/O systems. The performance of a hybrid SSD-HDD architecture depends on how the SSDs are utilized and how data placement is scheduled. In this paper, we propose a cost-aware region-level (CARL) data placement scheme for hybrid parallel I/O systems. CARL divides large files into several small regions, calculates the region costs according to the data access patterns, and selectively places regions with high access costs onto the SSD-based file servers. We have implemented CARL under MPI-IO and the PVFS2 parallel file system environment. Experimental results on representative benchmarks show that CARL is both feasible and able to improve I/O performance significantly.
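The three steps named in the abstract (split a file into regions, cost each region from its access pattern, place the costliest regions on SSD servers) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the cost model, so `toy_cost` is a placeholder assumption, and every name here (`Request`, `region_costs`, `place_regions`) as well as the greedy capacity-bounded selection is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    offset: int  # byte offset of the request within the file
    size: int    # request length in bytes


def toy_cost(req):
    # ASSUMPTION: placeholder cost, not the paper's model. A fixed
    # per-request overhead plus a transfer term, so many small requests
    # cost more than one large request of the same total size.
    return 1.0 + req.size / 1_000_000


def region_costs(requests, file_size, region_size, cost_fn):
    """Accumulate each request's cost into the region where it starts."""
    n_regions = (file_size + region_size - 1) // region_size
    costs = [0.0] * n_regions
    for req in requests:
        costs[req.offset // region_size] += cost_fn(req)
    return costs


def place_regions(costs, region_size, ssd_capacity):
    """Greedily put the highest-cost regions on SSD until capacity runs out."""
    slots = ssd_capacity // region_size
    ranked = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    on_ssd = {i for i in ranked[:slots] if costs[i] > 0}
    return ["SSD" if i in on_ssd else "HDD" for i in range(len(costs))]


# Hypothetical trace: one large request in region 0 and fifty small
# 4 KiB requests in region 2 of a 4 MB file split into 1 MB regions.
REGION = 1_000_000
trace = [Request(0, REGION)]
trace += [Request(2 * REGION + i * 4096, 4096) for i in range(50)]

costs = region_costs(trace, file_size=4 * REGION, region_size=REGION,
                     cost_fn=toy_cost)
placement = place_regions(costs, region_size=REGION, ssd_capacity=REGION)
print(placement)
```

With room for only one region on SSD, the sketch selects region 2, whose many small requests dominate the toy cost, and leaves the single large request in region 0 on HDD. A real cost model would need to reflect measured server and network behavior rather than this fixed overhead.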