On the Load Imbalance Problem of I/O Forwarding Layer in HPC Systems

Jie Yu, Guangming Liu, Wenrui Dong, Xiaoyong Li, Jian Zhang, Fuxing Sun
DOI: https://doi.org/10.1109/compcomm.2017.8322970
2017-01-01
Abstract: As the computing capability of top HPC systems increases towards exascale, the I/O traffic from a tremendous number of compute nodes has stretched the underlying storage system to its limit. I/O forwarding was proposed to address this problem by reducing the number of storage system clients to a much smaller number of I/O nodes. In this paper, we study the load imbalance problem of the I/O forwarding layer and find that the bursty I/O traffic of HPC applications and the common rank 0 I/O pattern make the load on I/O nodes highly unbalanced. Application performance is limited if some I/O nodes become hot spots while others have little workload to manage. We propose apportioning the heavy I/O workload of a single I/O node to multiple idle I/O nodes to alleviate the load imbalance. We implement our ideas in the open-source I/O forwarding software IOFSL. Preliminary results indicate that our approach can greatly improve I/O performance.
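The imbalance and the proposed remedy can be illustrated with a toy model. The sketch below is not the authors' implementation and does not use the IOFSL API. It assumes a static block mapping of compute nodes to I/O nodes and a simple threshold policy for moving excess load to the least-loaded I/O nodes. The function names (`static_mapping`, `forward`, `apportion`) and the `threshold` parameter are hypothetical.

```python
def static_mapping(num_compute_nodes, num_io_nodes):
    """Assign compute nodes to I/O nodes in contiguous blocks (assumed scheme)."""
    per_ion = -(-num_compute_nodes // num_io_nodes)  # ceiling division
    return {cn: cn // per_ion for cn in range(num_compute_nodes)}


def forward(requests, mapping, num_io_nodes):
    """Sum the bytes each I/O node receives under the static mapping.

    `requests` is a list of (compute_node, nbytes) pairs.
    """
    load = [0] * num_io_nodes
    for cn, nbytes in requests:
        load[mapping[cn]] += nbytes
    return load


def apportion(load, threshold):
    """Move load above `threshold` from hot I/O nodes to the least-loaded ones.

    Hypothetical policy: a receiving node is never filled past `threshold`.
    """
    load = list(load)  # work on a copy
    for hot in range(len(load)):
        while load[hot] > threshold:
            idle = min(range(len(load)), key=load.__getitem__)
            room = threshold - load[idle]
            if idle == hot or room <= 0:
                break  # no I/O node has spare capacity left
            moved = min(load[hot] - threshold, room)
            load[hot] -= moved
            load[idle] += moved
    return load
```

With 8 compute nodes, 4 I/O nodes, and a rank 0 pattern in which only compute node 0 writes 800 bytes, `forward([(0, 800)], static_mapping(8, 4), 4)` places the entire load on I/O node 0, and `apportion(load, 200)` spreads it evenly across all four I/O nodes. A real forwarding layer would also have to account for request ordering and data consistency, which this model ignores.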