DI-Tree: A Dual-ended Interval Tree for Efficient Event Matching in Content-based Pub/Sub Systems
Junshen Li, Haiyang Ren, Zhengyu Liao, Wanghua Shi, Shiyou Qian, Guangtao Xue, Jian Cao, Zhonglong Zheng
DOI: https://doi.org/10.1109/icpads63350.2024.00068
2024-01-01
Abstract: Content-based publish/subscribe systems achieve fine-grained data distribution, rapidly forwarding data from publishers to subscribers with specific requirements. The event matching algorithm is a fundamental component: it quickly searches for the subscriptions that match an event, based on constraints defined by subscribers. As data scales continue to expand, there is a heightened demand for more efficient, robust, and versatile event matching algorithms. In this paper, we propose a novel data structure called the Dual-ended Interval Tree (DI-Tree). First, given a splitting point, the DI-Tree classifies intervals into three categories based on the joint distribution of their left and right endpoints in the attribute value domain. Second, the DI-Tree uses blue and green nodes to store these three categories of intervals, which improves indexing efficiency. Third, by using the DI-Tree to efficiently search for matching and unmatching intervals, we develop forward and backward event matching algorithms. Finally, to improve matching efficiency and reduce memory usage, we introduce key optimization techniques that focus on node balance and bitset optimization. We conduct extensive experiments to evaluate the performance of the DI-Tree. Compared with five state-of-the-art event matching algorithms, the DI-Tree reduces matching time by up to 76.8% on average. This significant improvement highlights the effectiveness of our proposed strategies and the potential of the DI-Tree in optimizing event matching performance.
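The abstract does not define the three interval categories or the matching procedures, so the following Python sketch is only an illustration under stated assumptions, not the DI-Tree itself. It assumes closed intervals, assumes the three categories are "entirely left of the split", "entirely right of the split", and "spanning the split", and contrasts forward matching (collect intervals that contain the event value) with backward matching (mark intervals that do not contain it in a bitset, then take the complement) over a flat list rather than a tree. All names (`classify`, `forward_match`, `backward_match`) are hypothetical.

```python
from typing import List, Set, Tuple

Interval = Tuple[float, float]  # closed interval [low, high]; an assumption


def classify(interval: Interval, split: float) -> str:
    """Assumed three-way split: left of, right of, or spanning the split point."""
    low, high = interval
    if high < split:
        return "left"
    if low > split:
        return "right"
    return "spanning"


def forward_match(intervals: List[Interval], value: float) -> Set[int]:
    """Forward matching: directly collect the intervals that contain the value."""
    return {i for i, (low, high) in enumerate(intervals) if low <= value <= high}


def backward_match(intervals: List[Interval], value: float) -> Set[int]:
    """Backward matching: mark unmatching intervals in a bitset, then invert it."""
    unmatched = 0  # a Python int used as a bitset, one bit per interval
    for i, (low, high) in enumerate(intervals):
        if value < low or value > high:
            unmatched |= 1 << i
    all_bits = (1 << len(intervals)) - 1
    matched = all_bits & ~unmatched
    return {i for i in range(len(intervals)) if (matched >> i) & 1}


intervals = [(0.0, 3.0), (6.0, 9.0), (2.0, 7.0)]
categories = [classify(iv, split=5.0) for iv in intervals]
```

Both matching functions return the same set of indices; they differ in which intervals they touch, which is why an index that can quickly enumerate either the matching or the unmatching intervals can support both styles.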