Multi-layer Adaptive Sampling for Per-Flow Spread Measurement
Boyu Zhang, Yang Du, He Huang, Yu-E Sun, Guoju Gao, Xiaoyu Wang, Shiping Chen
DOI: https://doi.org/10.1007/978-3-030-95384-3_46
2022-01-01
Abstract: Per-flow spread measurement in high-speed networks, which estimates the number of distinct elements in each flow, plays an important role in many practical applications. Most existing solutions adopt compact data structures (i.e., sketches) that share memory units among flows to fit in limited on-chip memory, which results in low estimation accuracy for small flows. Unlike sketch-based solutions, non-duplicate sampling measures per-flow spreads by sampling each distinct element with the same probability. However, it ignores the fact that large flows need lower sampling probabilities than small flows to achieve the same relative estimation error, and thus wastes significant on-chip memory on large flows. This paper presents multi-layer adaptive sampling, which complements the prior work by assigning lower sampling probabilities to larger flows. The proposed framework employs a multi-layer model to sample distinct elements, so that most small flows stay in the lower layers while large flows advance to higher layers. In addition, higher layers are designed with smaller overall sampling probabilities, so that larger flows are sampled with lower probabilities. Experimental results on real Internet traces show that, compared to the state-of-the-art method, our solution reduces the average relative error of per-flow spread estimation by up to 86% and reduces the false positive rates (FPRs) and false negative rates (FNRs) of flow misclassification by around one to two orders of magnitude.
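The layered idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the layer probabilities, the promotion rule, the duplicate filter, or the estimator, so `LAYER_PROBS`, `PROMOTE_AFTER`, `unit_hash`, the `seen` set, and the inverse-probability estimate are all hypothetical choices made only for illustration.

```python
import hashlib
from collections import defaultdict

# Hypothetical parameters (assumptions, not values from the paper).
LAYER_PROBS = [1.0, 0.5, 0.25, 0.125]  # overall sampling probability of each layer
PROMOTE_AFTER = 8                       # samples a flow collects in a layer before moving up


def unit_hash(flow, element):
    """Map a (flow, element) pair to a deterministic value in [0, 1)."""
    digest = hashlib.blake2b(f"{flow}|{element}".encode(), digest_size=6).digest()
    return int.from_bytes(digest, "big") / 2**48


class MultiLayerSampler:
    """Illustrative multi-layer sampler; an assumption-laden sketch only."""

    def __init__(self):
        self.layer = defaultdict(int)  # current layer of each flow
        self.counts = defaultdict(lambda: [0] * len(LAYER_PROBS))  # samples per layer
        self.seen = set()  # already-sampled pairs; stands in for a compact duplicate filter

    def update(self, flow, element):
        key = (flow, element)
        if key in self.seen:
            return  # non-duplicate sampling: a sampled element is never counted twice
        k = self.layer[flow]
        # The same hash is compared against a shrinking threshold, so an element
        # rejected in a lower layer is also rejected in every higher layer.
        if unit_hash(flow, element) >= LAYER_PROBS[k]:
            return
        self.seen.add(key)
        self.counts[flow][k] += 1
        # Promote the flow once it has filled its quota in the current layer.
        if self.counts[flow][k] >= PROMOTE_AFTER and k + 1 < len(LAYER_PROBS):
            self.layer[flow] = k + 1

    def estimate(self, flow):
        # Weight each sample by the inverse of the probability of its layer.
        return sum(c / p for c, p in zip(self.counts.get(flow, []), LAYER_PROBS))
```

With these assumed parameters, a flow with at most eight distinct elements never leaves the first layer and is counted exactly, while a flow with thousands of distinct elements quickly reaches the top layer, where only about one in eight of its new elements is recorded.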