Accelerating Lossy Compression on HPC Datasets Via Partitioning Computation for Parallel Processing

Xiangyu Zou, Tao Lu, Sheng Di, Dingwen Tao, Wen Xia, Xuan Wang, Weizhe Zhang, Qing Liao
DOI: https://doi.org/10.1109/hpcc/smartcity/dss.2019.00246
2019-01-01
Abstract: Recently, increasing attention has been paid to data reduction in high-performance computing (HPC) environments, where large volumes of data are produced continually during scientific simulations. The SZ lossy compressor has been one of the best choices for HPC data reduction because it achieves high compression ratios while meeting data precision requirements. A high compression rate is also in strong demand, because many applications produce data at fairly high throughput. In this work, we aim to accelerate the SZ compressor significantly by developing a parallel model for the widely used point-wise relative error bound. Parallelizing SZ is non-trivial because of its strong data dependencies. To address this issue, we develop a pipeline-like method and exploit a series of strategies to parallelize the 'logarithmic transformation' and 'prediction + quantization' stages of SZ. Our evaluation with real-world scientific simulation datasets shows that our design improves the compression rate by over 2.0× in most cases while still guaranteeing the same compression ratio as the original serial version of SZ.
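To make the two stages named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes nonzero inputs, and it replaces SZ's 'prediction + quantization' stage with plain uniform quantization in the log domain. The function names (`log_transform_chunk`, `compress_chunk`, `decompress_chunk`, `parallel_roundtrip`) and the chunking scheme are hypothetical. The point it illustrates is that the logarithmic transformation is element-wise, so the data can be partitioned and each chunk processed independently, and that an absolute error bound of log2(1 + eps) in the log domain yields a point-wise relative error of at most eps after reconstruction.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def log_transform_chunk(chunk):
    """Map each nonzero value to (sign, log2|x|). Element-wise, so chunks are independent."""
    return [(1 if x > 0 else -1, math.log2(abs(x))) for x in chunk]


def compress_chunk(chunk, eps):
    """Quantize log-domain values; a simplified stand-in for SZ's prediction + quantization."""
    bound = math.log2(1.0 + eps)  # absolute error bound in the log domain
    step = 2.0 * bound            # rounding to a multiple of step errs by at most bound
    return [(s, round(y / step)) for s, y in log_transform_chunk(chunk)]


def decompress_chunk(codes, eps):
    """Invert the quantization and the logarithmic transformation."""
    step = 2.0 * math.log2(1.0 + eps)
    return [s * 2.0 ** (q * step) for s, q in codes]


def parallel_roundtrip(data, eps, n_chunks=4):
    """Partition data, compress chunks concurrently, then reconstruct in order."""
    size = max(1, math.ceil(len(data) / n_chunks))
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor() as pool:
        coded = list(pool.map(lambda c: compress_chunk(c, eps), chunks))
    return [x for codes in coded for x in decompress_chunk(codes, eps)]


if __name__ == "__main__":
    data = [1.5, -2.25, 1000.0, 3e-5, -7.0]
    eps = 0.01
    restored = parallel_roundtrip(data, eps, n_chunks=2)
    # Each reconstructed value stays within eps relative error of the original,
    # up to floating-point rounding.
    for original, approx in zip(data, restored):
        print(original, approx, abs(original - approx) / abs(original))
```

Threads are used here only to show the partitioning; in CPython they will not deliver the speedups reported in the abstract, and a real design would also have to handle zeros and the data dependencies of SZ's prediction stage.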