PTCP: A Priority-Based Transport Control Protocol for Timeout Mitigation in Commodity Data Center.
Chang Ruan, Jianxin Wang, Wanchun Jiang, Geyong Min, Yi Pan
DOI: https://doi.org/10.1016/j.future.2019.08.036
IF: 7.307
2019-01-01
Future Generation Computer Systems
Abstract: In data centers, TCP timeouts can dramatically degrade data transmission performance, causing problems such as TCP Incast, TCP Outcast and long query completion time. To mitigate timeouts, a transport protocol should keep the switch queue small to avoid packet loss and should recover lost packets quickly. Recent work suggests using Explicit Congestion Notification (ECN), Round Trip Time (RTT) or in-network signals to achieve this. However, these solutions either still suffer from many timeouts when the number of concurrent flows grows or require nontrivial hardware support. These limitations motivate us to design a Priority-based Transport Control Protocol, termed PTCP, to mitigate timeouts as far as possible in commodity data centers. The key idea of PTCP is to insert a high priority packet after each window of data packets. The key insight is that, since the data packets and the inserted packet have different priorities, they may arrive at the receiver in different orders depending on the network congestion. By checking the order in which the ACKs of the two kinds of packets are received, PTCP can infer the network congestion and finely adjust its sending window so that the switch buffer occupancy is kept small. Additionally, by keeping a high priority packet always in flight, PTCP can quickly decide to retransmit data packets that may have been lost. With these two mechanisms, PTCP significantly alleviates timeouts even when the number of concurrent flows becomes large. Furthermore, PTCP requires only the priority queuing function of switches, which is available in existing commodity switch hardware, so no hardware modification is needed. Extensive performance evaluation demonstrates that PTCP incurs zero timeouts and outperforms several state-of-the-art protocols on problems such as TCP Incast, TCP Outcast and long query completion time.
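The reordering insight in the abstract can be illustrated with a toy model. The sketch below is not the PTCP algorithm from the paper; it only simulates a single strict-priority switch port under assumed conditions (one packet arrives and one packet is served per time slot, and `backlog` low-priority packets from other flows are already queued). The names `delivery_order`, `overtaken`, `PROBE` and the packet labels are hypothetical and chosen for illustration only.

```python
from collections import deque

PROBE = "P"   # hypothetical label for the inserted high priority packet
CROSS = "x"   # hypothetical label for queued packets of other flows


def delivery_order(window, backlog):
    """Order in which one flow's packets leave a strict-priority port.

    Assumption: one arrival and one departure per time slot, and
    `backlog` low-priority cross-traffic packets are queued at slot 0.
    """
    low = deque([CROSS] * backlog)   # low-priority queue (data packets)
    high = deque()                   # high-priority queue (probe packet)
    arrivals = [f"d{i}" for i in range(window)] + [PROBE]
    delivered = []
    slot = 0
    while slot < len(arrivals) or low or high:
        if slot < len(arrivals):
            pkt = arrivals[slot]
            (high if pkt == PROBE else low).append(pkt)
        # Strict priority: always serve the high-priority queue first.
        out = high.popleft() if high else low.popleft()
        if out != CROSS:
            delivered.append(out)
        slot += 1
    return delivered


def overtaken(order):
    """Number of data packets delivered after the probe packet."""
    return len(order) - order.index(PROBE) - 1


if __name__ == "__main__":
    for backlog in (0, 3, 10):
        order = delivery_order(window=4, backlog=backlog)
        print(backlog, order, overtaken(order))
```

With an empty queue the probe arrives last, as sent; as the backlog grows, the probe overtakes more of the window's data packets. In this toy setting the number of overtaken packets is therefore a rough indicator of queue occupancy, which is the kind of signal the abstract describes the sender reading from the ACK order.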