I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis

Jean Luca Bez, Houjun Tang, Bing Xie, David Williams-Young, Rob Latham, Rob Ross, Sarp Oral, Suren Byna
DOI: https://doi.org/10.1109/pdsw54622.2021.00008
2021-11-01
Abstract: Using parallel file systems efficiently is a tricky problem due to inter-dependencies among multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO, POSIX, and file systems (GPFS, Lustre, etc.). Profiling tools such as Darshan collect traces to help understand the I/O performance behavior. However, there are significant gaps in analyzing the collected traces and then applying tuning options offered by various layers of I/O software. Seeking to connect the dots between I/O bottleneck detection and tuning, we propose DXT Explorer, an interactive log analysis tool. In this paper, we present a case study using our interactive log analysis tool to identify and apply various I/O optimizations. We report an evaluation of performance improvement achieved for four I/O kernels extracted from science applications.