SDRBench: Scientific Data Reduction Benchmark for Lossy Compressors

Kai Zhao, Sheng Di, Xin Liang, Sihuan Li, Dingwen Tao, Julie Bessac, Zizhong Chen, Franck Cappello
DOI: https://doi.org/10.48550/arXiv.2101.03201
2021-01-08
Distributed, Parallel, and Cluster Computing
Abstract: Efficient error-controlled lossy compressors are becoming critical to the success of today's large-scale scientific applications because of the ever-increasing volume of data produced by the applications. In the past decade, many lossless and lossy compressors have been developed with distinct design principles for different scientific datasets in largely diverse scientific domains. In order to support researchers and users assessing and comparing compressors in a fair and convenient way, we establish a standard compression assessment benchmark -- Scientific Data Reduction Benchmark (SDRBench). SDRBench contains a vast variety of real-world scientific datasets across different domains, summarizes several critical compression quality evaluation metrics, and integrates many state-of-the-art lossy and lossless compressors. We demonstrate evaluation results using SDRBench and summarize six valuable takeaways that are helpful to the in-depth understanding of lossy compressors.