Space- and Computationally-Efficient Set Reconciliation via Parity Bitmap Sketch (PBS)
Long Gong, Ziheng Liu, Liang Liu, Jun Xu, Mitsunori Ogihara, Tong Yang
DOI: https://doi.org/10.14778/3436905.3436906
IF: 2.5
2020-01-01
Proceedings of the VLDB Endowment
Abstract: Set reconciliation is a fundamental algorithmic problem that arises in many networking, system, and database applications. In this problem, two large sets A and B of objects (bitcoins, files, records, etc.) are stored respectively at two different network-connected hosts, which we name Alice and Bob respectively. Alice and Bob communicate with each other to learn A △ B, the difference between A and B, and as a result the reconciled set A ∪ B. Current set reconciliation schemes are based on either invertible Bloom filters (IBF) or error-correction codes (ECC). The former has a low computational complexity of O(d), where d is the cardinality of A △ B, but has a high communication overhead that is several times larger than the theoretical minimum. The latter has a low communication overhead close to the theoretical minimum, but has a much higher computational complexity of O(d²). In this work, we propose Parity Bitmap Sketch (PBS), an ECC-based set reconciliation scheme that gets the better of both worlds: PBS has both a low computational complexity of O(d) just like IBF-based solutions and a low communication overhead of roughly twice the theoretical minimum. A separate contribution of this work is a novel rigorous analytical framework that can be used for the precise calculation of various performance metrics and for the near-optimal parameter tuning of PBS.
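
As a small illustration of the problem statement only (not of PBS, IBF, or any ECC-based scheme), the following Python sketch shows the quantities the abstract refers to: the symmetric difference of A and B, its cardinality d, and the reconciled union. The element values and the helper name `reconcile` are hypothetical, and both sets are held in one process here, whereas a real scheme must learn the difference by exchanging compact messages between the two hosts.

```python
# Toy illustration of the set reconciliation problem (assumed example data).
# This is NOT the PBS protocol; it only shows what the two hosts want to learn.

def reconcile(local: set[int], difference: set[int]) -> set[int]:
    """Return the union of A and B, given one host's set and the symmetric difference."""
    return local | difference

A = {1, 2, 3, 5, 8}   # Alice's set (hypothetical values)
B = {2, 3, 5, 7}      # Bob's set (hypothetical values)

diff = A ^ B          # symmetric difference: elements in exactly one of the sets
d = len(diff)         # the cardinality d used in the complexity bounds

# Once either host knows the difference, it can build the reconciled set alone.
alice_view = reconcile(A, diff)
bob_view = reconcile(B, diff)
```

Both `alice_view` and `bob_view` equal the union of A and B, which is why learning the difference suffices for reconciliation; the schemes compared in the abstract differ in how cheaply that difference can be communicated and decoded.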