RDCM: Reliable Data Center Multicast

Dan Li, Mingwei Xu, Ming-chen Zhao, Chuanxiong Guo, Youngguang Zhang, Min-you Wu
DOI: https://doi.org/10.1109/infcom.2011.5935228
2011-01-01
Abstract: Multicast benefits data center group communication by both saving network traffic and improving application throughput. The SLA (Service Level Agreement) of a cloud service requires distributed applications to compute correctly, which translates into a requirement for reliable multicast delivery. In this paper we present RDCM, a novel reliable multicast approach for data center networks. The key idea of RDCM is to minimize the impact of packet loss on multicast performance by leveraging the rich link resources in data centers. A multicast-tree-aware backup overlay is purposely built on group members for peer-to-peer packet repair. Because it rides on unicast, packet repair not only achieves complete repair isolation but also has a high probability of bypassing the pathological links in the multicast tree where packet loss occurs. The backup overlay is organized so that individual repair burden, control overhead, and overall repair traffic all stay low. We have implemented RDCM as a user-level library on the Windows platform. Experiments on our test bed show that RDCM handles packet loss without obvious throughput degradation during high-speed data transmission.