Bandwidth-efficient Microburst Measurement in Large-scale Datacenter Networks

Kaihui Gao, Dan Li, Shuai Wang
DOI: https://doi.org/10.1145/3542637.3542640
2022-01-01
Abstract: Microburst measurement is essential for diagnosing and mitigating performance problems in datacenter networks. The key is to efficiently identify the flows that contribute the most to queue buildup. However, because existing microburst measurement systems capture packet-level information, they incur significant bandwidth overhead. We present BurstScope, a bandwidth-efficient microburst measurement system that profiles microburst characteristics and the contributing flows. BurstScope detects microburst-involved packets in the egress pipeline, then coarsens the measurement granularity from packet level to flow level using an invertible sketch. Finally, by carefully partitioning the measurement and statistics tasks between the data plane and the control plane, it generates only one telemetry packet per microburst. We have implemented BurstScope on Barefoot Tofino switches. Testbed-based evaluations show that BurstScope maintains low bandwidth overhead (< 0.02%) and high identification accuracy (> 97%). Compared with the state-of-the-art system, BurstScope can reduce bandwidth overhead by 60×.
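The pipeline described in the abstract (detect microburst packets at egress, aggregate them per flow, emit one report per microburst) can be illustrated with a minimal software sketch. This is not the BurstScope implementation: the queue-depth threshold, the packet tuple layout, and the function names are all assumptions made for illustration, and a plain dictionary stands in for the invertible sketch that the real system runs in the switch data plane.

```python
from collections import defaultdict

# Hypothetical threshold: queue depth (in packets) at or above which a
# packet is treated as part of a microburst. Not a value from the paper.
QUEUE_THRESHOLD = 100


def _make_report(start, end, peak, flow_bytes):
    """Build the single per-microburst report (stand-in for one telemetry packet)."""
    flows = sorted(flow_bytes.items(), key=lambda kv: kv[1], reverse=True)
    return {"start": start, "end": end, "peak_depth": peak, "flows": flows}


def measure_microbursts(packets, threshold=QUEUE_THRESHOLD):
    """Aggregate microburst packets to flow level and emit one report per burst.

    `packets` is an iterable of (timestamp, flow_id, size_bytes, queue_depth)
    tuples in egress order. A dict replaces the invertible sketch here.
    """
    reports = []
    flow_bytes = None  # None means no microburst is currently in progress
    start = end = None
    peak = 0
    for ts, flow_id, size, depth in packets:
        if depth >= threshold:
            if flow_bytes is None:  # first packet of a new microburst
                flow_bytes = defaultdict(int)
                start, peak = ts, 0
            flow_bytes[flow_id] += size  # packet-level -> flow-level
            end = ts
            peak = max(peak, depth)
        elif flow_bytes is not None:  # queue drained: the microburst is over
            reports.append(_make_report(start, end, peak, flow_bytes))
            flow_bytes = None
    if flow_bytes is not None:  # trace ended while a burst was still active
        reports.append(_make_report(start, end, peak, flow_bytes))
    return reports


if __name__ == "__main__":
    trace = [
        (0, "A", 100, 10),
        (1, "A", 100, 120),
        (2, "B", 300, 150),
        (3, "A", 100, 130),
        (4, "C", 100, 20),
    ]
    for report in measure_microbursts(trace):
        print(report)
```

The point of the sketch is the reporting granularity: however many packets fall inside a burst, the collector receives one record listing the contributing flows ranked by bytes, rather than one record per packet.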