SketchINT: Empowering INT with TowerSketch for Per-Flow Per-Switch Measurement

Kaicheng Yang, Sheng Long, Qilong Shi, Yuanpeng Li, Zirui Liu, Yuhan Wu, Tong Yang, Zhengyi Jia
DOI: https://doi.org/10.1109/tpds.2023.3303924
IF: 5.3
2023-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Network measurement is indispensable to network operations. INT solutions, which provide fine-grained per-switch per-packet information, are promising for per-flow per-switch measurement. The main shortcoming of INT is the high network overhead incurred by collecting INT information, which makes INT impractical for production deployment. Sketches, which compactly record per-flow information with a small memory footprint, are a promising choice for compressing INT information to reduce INT overhead. An ideal sketch for efficiently compressing INT information in practice should achieve both simplicity and accuracy, but no existing sketch achieves both. Motivated by this, we first design SketchINT to combine INT and sketches, aiming to obtain all per-flow per-switch information with low network overhead. Second, we design a new sketch for SketchINT, namely TowerSketch, which achieves both simplicity and accuracy. The key idea of TowerSketch is to use different-sized counters for different arrays while keeping the number of bits used by each array the same. TowerSketch can automatically record larger flows in larger counters and smaller flows in smaller counters. To further ease configuration and give network operators more confidence in the performance of TowerSketch, we propose a method for precise error bound estimation. We have fully implemented our SketchINT prototype on a testbed consisting of 10 switches. We also implement TowerSketch on P4, single-core CPU, multi-core CPU, and FPGA platforms to verify its deployment flexibility.
Extensive experimental results verify that 1) TowerSketch achieves better accuracy than prior art on various tasks, outperforming the state-of-the-art ElasticSketch by up to 27.7 times in terms of error; 2) compared to INT, SketchINT reduces the number of control-plane overhead packets by $3 \sim 4$ orders of magnitude with an error smaller than 5%; 3) the estimated error bound of TowerSketch almost matches the actual error bound.
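To make the key idea of TowerSketch concrete, the following is a minimal Python sketch of a structure in which every array receives the same bit budget but uses a different counter width, so narrow-counter arrays hold many counters and wide-counter arrays hold few. The abstract does not specify hashing, overflow handling, or the query rule, so everything beyond the equal-bit-budget layout is an assumption made for illustration: the class name `TowerSketchDemo`, the default widths, the BLAKE2b-based hash, the treatment of a counter at its maximum value as saturated, and the minimum-over-unsaturated-counters query.

```python
import hashlib


class TowerSketchDemo:
    """Illustrative only: arrays share one bit budget but differ in counter width."""

    def __init__(self, bits_per_array=1024, counter_bits=(2, 4, 8, 16)):
        self.levels = []
        for seed, width in enumerate(counter_bits):
            self.levels.append({
                "width": width,
                "cap": (1 << width) - 1,  # assumed: max value marks saturation
                "counters": [0] * (bits_per_array // width),
                "seed": seed,
            })

    def _index(self, key, level):
        # Assumed hash: BLAKE2b salted with the array's seed.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([level["seed"]])
        ).digest()
        return int.from_bytes(digest, "big") % len(level["counters"])

    def insert(self, key):
        for level in self.levels:
            i = self._index(key, level)
            # A counter that has reached its cap stays there (saturated).
            if level["counters"][i] < level["cap"]:
                level["counters"][i] += 1

    def query(self, key):
        # Assumed rule: ignore saturated counters, take the minimum of the rest.
        usable = []
        for level in self.levels:
            value = level["counters"][self._index(key, level)]
            if value < level["cap"]:
                usable.append(value)
        if usable:
            return min(usable)
        # Every mapped counter is saturated: report the largest cap as a lower bound.
        return max(level["cap"] for level in self.levels)


sketch = TowerSketchDemo()
for _ in range(1000):
    sketch.insert("big-flow")
for _ in range(3):
    sketch.insert("small-flow")
print(sketch.query("big-flow"), sketch.query("small-flow"))
```

Under these assumptions, the 2-bit, 4-bit, and 8-bit counters touched by the large flow saturate quickly, so its estimate comes from the 16-bit array, while the small flow can still be answered from one of the narrower, more numerous counters.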