Achieving Efficient Scheduling Based on Accurate Measurement of Small Flows in Data Center

Jiawei Huang, Qile Wang, Zhaoyi Li, Yijun Li, Zihao Chen, Sitan Li, Jing Shao, Jingling Liu, Min Zhan, Jianxin Wang
DOI: https://doi.org/10.1145/3673038.3673086
2024-01-01
Abstract: In modern data centers, many flow scheduling schemes have been proposed to accelerate data transfer and improve user experience. However, these schemes assume prior knowledge of flow sizes, which is hard to obtain without modifying data center applications. Sketch-based approaches measure flow sizes at the switch using a compact memory structure, with high throughput and acceptable accuracy loss. However, existing sketches commonly focus on large or specific flows, whereas most flows in data center networks are small; as a result, size information for small flows is missing or overestimated. We propose Strainer Sketch, which enables accurate and fast measurement of small flows with a small memory footprint and can be deployed flexibly with a variety of scheduling algorithms. Specifically, Strainer Sketch uses a hierarchical structure to mitigate hash collisions between large and small flows, and a probabilistic counting algorithm to mitigate overestimation caused by hash collisions among small flows. Furthermore, we propose a packet scheduling algorithm, SW-PIFO, which discriminates among a huge number of small flows using a limited number of queues. Through testbed experiments and simulations of typical data center applications, we show that our scheme reduces the small-flow completion time (FCT) by up to 56.7% compared with flow scheduling using classic sketches.
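To make the hierarchical idea concrete, the following is a minimal sketch and not the authors' Strainer Sketch. It assumes a two-layer count-min structure in which a small first layer with saturating counters absorbs small flows and only the excess of large flows spills into a second layer. The paper's probabilistic counting algorithm is not specified in the abstract, so a standard conservative update is swapped in here as a stand-in for limiting overestimation. The class name `TwoLayerSketch` and all parameters (`rows`, `small_width`, `large_width`, `small_cap`) are hypothetical.

```python
import hashlib


class TwoLayerSketch:
    """Illustrative two-layer count-min sketch (an assumption, not the paper's design)."""

    def __init__(self, rows=3, small_width=1024, large_width=256, small_cap=15):
        self.rows = rows
        self.small_width = small_width
        self.large_width = large_width
        self.small_cap = small_cap  # saturation value of a layer-1 counter
        self.small = [[0] * small_width for _ in range(rows)]
        self.large = [[0] * large_width for _ in range(rows)]

    def _cells(self, key, width, layer):
        # One independent hash per (layer, row), derived from BLAKE2b.
        cells = []
        for row in range(self.rows):
            data = f"{layer}:{row}:{key}".encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            cells.append((row, int.from_bytes(digest, "big") % width))
        return cells

    @staticmethod
    def _conservative_increment(table, cells):
        # Conservative update: raise only the counters that hold the minimum.
        low = min(table[r][c] for r, c in cells)
        for r, c in cells:
            if table[r][c] == low:
                table[r][c] += 1

    def update(self, key, packets=1):
        for _ in range(packets):
            cells = self._cells(key, self.small_width, "small")
            if min(self.small[r][c] for r, c in cells) < self.small_cap:
                self._conservative_increment(self.small, cells)
            else:
                # Layer 1 is saturated for this flow: count the excess in layer 2.
                cells = self._cells(key, self.large_width, "large")
                self._conservative_increment(self.large, cells)

    def estimate(self, key):
        cells = self._cells(key, self.small_width, "small")
        small = min(self.small[r][c] for r, c in cells)
        if small < self.small_cap:
            return small  # treated as a small flow; layer 2 is not consulted
        cells = self._cells(key, self.large_width, "large")
        return small + min(self.large[r][c] for r, c in cells)


sketch = TwoLayerSketch()
sketch.update("mouse-flow", 5)
sketch.update("elephant-flow", 100)
print(sketch.estimate("mouse-flow"), sketch.estimate("elephant-flow"))
```

Under these assumptions an estimate never falls below the true packet count, and a large flow perturbs at most `small_cap` units of any first-layer counter it shares with a small flow. A scheduler could then use `estimate` as the flow-size input for prioritizing small flows.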