Micro-Burst Aware ECN in Multi-Queue Data Centers: Algorithm and Implementation

Jinghui Zhang, Dian Shen, Fang Dong, Kexi Kang, Jiahui Jin, Junzhou Luo, Junxue Zhang
DOI: https://doi.org/10.1109/tnse.2023.3271869
IF: 6.6
2024-01-01
IEEE Transactions on Network Science and Engineering
Abstract: With the development of multi-queue data centers, various efforts have leveraged Explicit Congestion Notification (ECN) to achieve high throughput and low latency for data communication. However, a deep-seated problem is that micro-burst traffic can push the instantaneous queue length above the ECN threshold, causing significant ECN mismarking. Such mismarking can in turn severely degrade network performance. In this paper, we propose Micro-Burst aware ECN (MBECN+) to mitigate this issue. MBECN+ is a novel scheme that decouples ECN threshold setting from ECN marking. First, it finds a more appropriate ECN threshold on a per-queue basis to eliminate the spurious congestion signals caused by micro-burst traffic. Second, it uses a double-threshold ECN marking scheme that marks packets at a finer granularity. To account for the queue backlog caused by micro-burst traffic, it marks packets when they are dequeued rather than when they are enqueued. Furthermore, we analyze the feasibility of implementing MBECN+ on commodity layer-3 multi-queue switches. Through testbed experiments and large-scale simulations, we demonstrate that MBECN+ can improve throughput by up to $\sim$20% and reduce flow completion time (FCT) by up to $\sim$40%. The throughput under MBECN+ improves by 1.5$\sim$2.4$\times$ over DCTCP and 1.26$\sim$1.35$\times$ over ECN*.
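To make the idea of double-threshold marking at dequeue time concrete, the following is a minimal sketch and not the MBECN+ algorithm itself, whose details the abstract does not give. The class name `DequeueMarkingQueue`, the thresholds `k_low` and `k_high`, and the linear marking probability between them are all assumptions chosen for illustration (the linear ramp is borrowed from RED-style marking). The sketch marks a packet based on the backlog left behind it when it leaves the queue, rather than on the queue length seen when it arrived.

```python
import random
from collections import deque


class DequeueMarkingQueue:
    """Hypothetical single queue that decides ECN marking at dequeue time.

    k_low and k_high are illustrative thresholds in packets; they are
    assumptions, not values or names taken from the paper.
    """

    def __init__(self, k_low, k_high, rng=None):
        if not 0 <= k_low < k_high:
            raise ValueError("require 0 <= k_low < k_high")
        self.k_low = k_low
        self.k_high = k_high
        # Seeded generator so that runs are repeatable.
        self.rng = rng if rng is not None else random.Random(0)
        self.packets = deque()

    def enqueue(self, packet):
        # No marking decision is taken on arrival.
        self.packets.append(packet)

    def mark_probability(self, backlog):
        # Assumed rule: never mark at or below k_low, always mark at or
        # above k_high, and ramp linearly in between.
        if backlog <= self.k_low:
            return 0.0
        if backlog >= self.k_high:
            return 1.0
        return (backlog - self.k_low) / (self.k_high - self.k_low)

    def dequeue(self):
        packet = self.packets.popleft()
        # Backlog = packets still waiting behind the departing one.
        backlog = len(self.packets)
        marked = self.rng.random() < self.mark_probability(backlog)
        return packet, marked


if __name__ == "__main__":
    q = DequeueMarkingQueue(k_low=2, k_high=6)
    for i in range(10):
        q.enqueue(i)
    while q.packets:
        print(q.dequeue())
```

Because the decision uses the backlog at departure, a short burst that has already drained by the time a packet leaves produces no mark in this sketch, whereas an enqueue-time check would have marked it.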