Enabling Efficient Random Access to Hierarchically-Compressed Data

Feng Zhang, Jidong Zhai, Xipeng Shen, Onur Mutlu, Xiaoyong Du
DOI: https://doi.org/10.1109/icde48307.2020.00097
2020-01-01
Abstract: Recent studies have shown the promise of direct data processing on hierarchically-compressed text documents. By removing the need for decompressing data, the direct data processing technique brings large savings in both time and space. However, its benefits have been limited to data traversal operations; for random accesses, direct data processing is several times slower than the state-of-the-art baselines. This paper presents a set of techniques that successfully eliminate the limitation, and for the first time, establishes the feasibility of effectively handling both data traversal operations and random data accesses on hierarchically-compressed data. The work yields a new library, which achieves 3.1× speedup over the state-of-the-art on random data accesses to compressed data, while preserving the capability of supporting traversal operations efficiently and providing large (3.9×) space savings.