Towards Zero Copy Dataflows using RDMA

Bairen Yi, Jiacheng Xia, Li Chen, Kai Chen
DOI: https://doi.org/10.1145/3123878.3131975
2017-08-22
Abstract: Remote Direct Memory Access (RDMA) offers ultra-low latency and CPU-bypass networking to application programmers. Existing applications are often designed around a socket-based software stack that manages application buffers separately from networking buffers and performs memory copies between them when sending and receiving data. With large application buffers (up to hundreds of MB), the cost of such copies adds non-trivial overhead to the end-to-end communication pipeline. In this work, we attempt to design a zero-copy transport for distributed dataflow frameworks that unifies application and networking buffer management and completely eliminates unnecessary memory copies. Our prototype on top of TensorFlow shows a 2.43x performance improvement over the gRPC-based transport and a 1.21x performance improvement over an alternative RDMA transport with private buffers and memory copies.