TCP PLATO: Packet Labelling to Alleviate Time-Out

Shikhar Shukla, Shingau Chan, Adrian S.-W. Tam, Abhishek Gupta, Yang Xu, Jonathan Chao
DOI: https://doi.org/10.1109/jsac.2014.140107
IF: 16.4
2014-01-01
IEEE Journal on Selected Areas in Communications
Abstract: Many applications (e.g., cluster-based storage and MapReduce) in modern data centers require a high fan-in, many-to-one type of data communication (known as TCP incast), which can cause severe incast congestion in switches and result in TCP goodput collapse, substantially degrading application performance. The root cause of such a collapse is the long idle period of the Retransmission Timeout (RTO) that is triggered at one or more senders by packet losses in congested switches. In this paper we develop a packet labelling scheme, PLATO, which improves the loss detection capabilities of NewReno. Packets carrying a special label are preferentially enqueued at the switch. This allows TCP to detect packet loss using three duplicate acknowledgements instead of the time-expensive RTO, thus avoiding the goodput collapse. PLATO makes minor modifications to NewReno and does not alter its congestion control mechanism. PLATO is implemented and evaluated in simulation using Network Simulator 3 (NS3). Its performance is significantly better than that of NewReno as well as the state-of-the-art incast solutions Incast Control TCP (ICTCP) and Data Center TCP (DCTCP). We also show that TCP PLATO can be implemented using commodity switches with the Weighted Random Early Detection (WRED) function.
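The preferential enqueueing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which packets receive the label or how the switch is configured, so the `Packet.labelled` flag, the `LabelAwareQueue` class, and both thresholds are hypothetical. Real WRED drops packets probabilistically between per-class thresholds; the sketch simplifies this to two hard limits, a lower one for unlabelled packets and the full buffer for labelled ones.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    labelled: bool = False  # hypothetical flag standing in for the PLATO label


class LabelAwareQueue:
    """Drop-tail queue that admits labelled packets more readily.

    Assumption: unlabelled packets are dropped once the queue holds
    `unlabelled_limit` packets, while labelled packets are admitted
    until the queue holds `capacity` packets.
    """

    def __init__(self, capacity: int, unlabelled_limit: int):
        if not 0 < unlabelled_limit <= capacity:
            raise ValueError("need 0 < unlabelled_limit <= capacity")
        self.capacity = capacity
        self.unlabelled_limit = unlabelled_limit
        self.buffer = deque()

    def enqueue(self, pkt: Packet) -> bool:
        """Return True if the packet was queued, False if it was dropped."""
        limit = self.capacity if pkt.labelled else self.unlabelled_limit
        if len(self.buffer) >= limit:
            return False
        self.buffer.append(pkt)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.buffer.popleft() if self.buffer else None
```

With `LabelAwareQueue(capacity=4, unlabelled_limit=2)`, a third unlabelled packet is dropped while labelled packets are still accepted until four packets are queued. Under these assumptions a labelled packet is more likely to survive congestion, reach the receiver, and contribute to the duplicate acknowledgements that let the sender detect loss without waiting for the RTO.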