KTSketch: Finding K-Persistent T-Spread Flows in High-Speed Networks
Shaolong Zhou, Hanwen Zhang, Guoju Gao, Yu-e Sun, He Huang, Xiaoyu Wang, Yihuai Wang
DOI: https://doi.org/10.1007/978-981-97-7241-4_21
2024-01-01
Abstract: Finding flows with a large spread (i.e., a large number of distinct elements) across multiple periods has many practical applications, such as detecting network scanners, click fraud, and DDoS attacks. Most previous works focus on either finding persistent flows or estimating flow spreads. To the best of our knowledge, no previous work finds flows whose spread frequently exceeds a threshold. In this paper, we study a new problem called finding k-persistent t-spread flows in high-speed networks, which detects the flows whose spreads exceed a preset threshold t during at least ⌈k · T⌉ out of T measurement periods, where k ∈ (0, 1] and the parameters can be arbitrarily defined. To this end, we propose a novel sketch, called KTSketch, to find k-persistent t-spread flows in real time. The key idea of KTSketch is to first identify the flows whose spread exceeds the threshold t in the current period as potential flows, and then to estimate their spreads separately to find k-persistent t-spread flows. We compare the performance of KTSketch with three baseline solutions (Bloom Filter+Count-Min, VHLL, and FreeBS-SSD). The experimental results show that, relative to these three baselines respectively, KTSketch achieves around 49.4%, 59.0%, and 27.1% higher F1 score; 3.89, 2.13, and 13.8 times higher throughput; and 43.1%, 81.2%, and 72.2% lower ARE.