Clustering based Probabilistic I/O Scheduling for Burst-Buffers Equipped HPC.

Benbo Zha, Hong Shen, Hankz Hankui Zhuo, Zhijian Luo
DOI: https://doi.org/10.1109/PAAP60200.2023.10391426
2023-01-01
Abstract: Modern High-Performance Computing (HPC) platforms usually include an intermediate high-throughput layer, Burst-Buffers (BBs), between the computing nodes and the underlying shared Parallel File System (PFS) to absorb the I/O bursts caused by concurrent I/O requests from different applications. As concurrent applications increase I/O demand, BBs may experience I/O contention due to their limited capacity. The existing probabilistic I/O scheduling method schedules I/O under limited BB capacity by sensing BB congestion through a Markov-chain-based probability model. However, to obtain an accurate I/O load estimate, the probability model requires applications to have consistent I/O characteristics, including similar I/O durations and long application lengths. These consistency conditions often do not hold in realistic situations. In this paper, we propose a probabilistic I/O scheduling framework based on application clustering (PIOS) to eliminate the consistency requirement. The framework first clusters all applications by 1-D K-means according to their I/O phase length. Next, the expected I/O workload of each cluster is calculated, and the BBs’ capacity is partitioned according to the expected I/O workload. Finally, probabilistic I/O scheduling is applied to each application cluster. The simulation results demonstrate that our framework adapts to inconsistent I/O characteristics and achieves higher efficiency.
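The first two steps described in the abstract (clustering by I/O phase length, then partitioning BB capacity by expected workload) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `App` fields, the quantile-based centroid initialisation, the workload estimate (I/O volume weighted by the fraction of time spent in I/O), and the proportional partitioning rule are all assumptions made for illustration; the abstract only states that capacity is partitioned "according to" the expected workload. The per-cluster probabilistic scheduler itself is not sketched here.

```python
from dataclasses import dataclass


@dataclass
class App:
    # All fields are hypothetical; the abstract does not define a data model.
    name: str
    io_phase_len: float  # duration of one I/O phase (s)
    compute_len: float   # duration of compute between I/O phases (s)
    io_volume: float     # data written per I/O phase (GB)


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    srt = sorted(values)
    # Assumed initialisation: centroids at evenly spaced quantiles.
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
    return labels


def expected_workload(app):
    """Assumed estimate: volume weighted by the fraction of time in I/O."""
    return app.io_volume * app.io_phase_len / (app.io_phase_len + app.compute_len)


def partition_capacity(apps, labels, capacity):
    """Split BB capacity across clusters in proportion to expected workload."""
    load = {}
    for app, lab in zip(apps, labels):
        load[lab] = load.get(lab, 0.0) + expected_workload(app)
    total = sum(load.values())
    return {lab: capacity * w / total for lab, w in load.items()}


apps = [
    App("a", 2.0, 18.0, 10.0),
    App("b", 3.0, 27.0, 20.0),
    App("c", 50.0, 50.0, 6.0),
    App("d", 60.0, 40.0, 10.0),
]
labels = kmeans_1d([a.io_phase_len for a in apps], k=2)
shares = partition_capacity(apps, labels, capacity=120.0)
```

In this toy input, the two applications with short I/O phases fall into one cluster and the two with long phases into the other, and each cluster receives a capacity share proportional to its summed workload estimate. A real deployment would substitute the paper's own workload model for `expected_workload`.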