Thallus: An RDMA-based Columnar Data Transport Protocol

Jayjeet Chakraborty, Matthieu Dorier, Philip Carns, Robert Ross, Carlos Maltzahn, Heiner Litz
2024-12-03
Abstract: The volume of data generated and stored in contemporary global data centers is experiencing exponential growth. This rapid data growth necessitates efficient processing and analysis to extract valuable business insights. In distributed data processing systems, data undergoes exchanges between the compute servers that contribute significantly to the total data processing duration in adequately large clusters, necessitating efficient data transport protocols. Traditionally, data transport frameworks such as JDBC and ODBC have used TCP/IP-over-Ethernet as their underlying network protocol. Such frameworks require serializing the data into a single contiguous buffer before handing it off to the network card, primarily due to the requirement of contiguous data in TCP/IP. In OLAP use cases, this serialization process is costly for columnar data batches as it involves numerous memory copies that hurt data transport duration and overall data processing performance. We study the serialization overhead in the context of a widely-used columnar data format, Apache Arrow, and propose leveraging RDMA to transport Arrow data over Infiniband in a zero-copy manner. We design and implement Thallus, an RDMA-based columnar data transport protocol for Apache Arrow based on the Thallium framework from the Mochi ecosystem, compare it with a purely Thallium RPC-based implementation, and show substantial performance improvements can be achieved by using RDMA for columnar data transport.
Distributed, Parallel, and Cluster Computing, Databases, Operating Systems
What problem does this paper attempt to address?
The paper addresses poor data-transfer performance in modern distributed data-processing systems, which it attributes to serialization overhead during transport. Specifically:

1. **Data Growth and Processing Requirements**: The amount of data generated and stored in contemporary global data centers is growing exponentially, which requires efficient processing and analysis to extract valuable information. In distributed data-processing systems, data is exchanged between compute servers, and these exchanges account for a significant share of total data-processing time, especially in large clusters.
2. **Limitations of Traditional Transport Frameworks**: Traditional data-transport frameworks (such as JDBC and ODBC) use TCP/IP-over-Ethernet as the underlying network protocol. These frameworks require data to be serialized into a single contiguous buffer before it is handed to the network card, mainly because TCP/IP requires contiguous data. For columnar data formats (such as Apache Arrow), this serialization involves a large number of memory copies, which increases data-transfer time and degrades overall data-processing performance.
3. **Impact of Serialization Overhead**: In OLAP (Online Analytical Processing) use cases in particular, serialization is very costly for columnar data batches because it involves multiple memory copies. The cost shows up in cross-host transfers: even Apache Arrow, which is zero-copy within a single host, requires expensive serialization when data is sent across hosts.

To solve these problems, the authors propose Thallus, an RDMA-based (Remote Direct Memory Access) columnar data-transport protocol for zero-copy transmission of Apache Arrow data over Infiniband. Using RDMA avoids the serialization overhead, which significantly improves data-transfer performance and query-execution speed.
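The difference between the two transport paths described above can be sketched in a few lines of standard-library Python. This is only an illustration of the copy-versus-reference distinction, not Thallus's implementation or the Arrow API: the `columns` dictionary, the function names, and the use of `array` buffers as stand-ins for Arrow column buffers are all assumptions made for the example.

```python
from array import array

# Hypothetical record batch: each column lives in its own, separately
# allocated buffer (a stand-in for Arrow column buffers).
columns = {
    "id": array("q", range(1000)),      # 1000 x 64-bit integers
    "price": array("d", [1.5] * 1000),  # 1000 x 64-bit floats
}


def serialize_contiguous(cols):
    """TCP-style path: copy every column into one contiguous buffer."""
    out = bytearray()
    copied = 0
    for buf in cols.values():
        view = memoryview(buf).cast("B")  # byte view, no copy yet
        out += view                       # this append is the memory copy
        copied += view.nbytes
    return out, copied


def scatter_gather_segments(cols):
    """RDMA-style path: describe each column buffer in place.

    Returns one byte view per column; no column data is copied. A real
    RDMA transport would register such regions and pass their
    (address, length) pairs to the network card.
    """
    return [memoryview(buf).cast("B") for buf in cols.values()]


payload, copied = serialize_contiguous(columns)
segments = scatter_gather_segments(columns)

print(f"serialized path copied {copied} bytes into one buffer")
print(f"segment path references {sum(s.nbytes for s in segments)} bytes "
      f"in {len(segments)} buffers without copying")
```

In the first function the bytes copied grow with the batch size, while the second only builds one small descriptor per column; that is the overhead a zero-copy transport aims to remove.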
### Main Improvement Points:

- **Reduce Serialization Overhead**: By transmitting data directly via RDMA, the serialization and deserialization steps of traditional transport methods are avoided.
- **Improve Transmission Performance**: Experiments show that data transfer using Thallus is 5.5 times faster than pure Thallium RPC, and end-to-end query-execution performance is improved by 2.5 times.
- **Wide Applicability**: It is suitable for various OLAP workloads, demonstrating the feasibility and superior performance of RDMA in modern data-processing applications.

### Summary:

The paper aims to accelerate the transmission of columnar data in distributed data-processing systems by designing and implementing the Thallus protocol on top of RDMA, thereby reducing serialization overhead and improving the overall performance of data transfer and query execution.