Extending TCP for Accelerating Replication on Cluster File Systems over SDNs

Sungheon Lim, Hyogon Kim
DOI: https://doi.org/10.48550/arXiv.1812.10584
2019-03-07
Abstract: This paper explores the changes required of TCP to efficiently support cluster file systems, such as the Hadoop Distributed File System (HDFS), whose storage nodes are connected through a software-defined network (SDN). Traditional chain replication in these file systems incurs large delays and uses the network inefficiently. SDN, however, can cooperate with the cluster file system to address these problems by pre-arranging a distribution tree, which opens up the possibility of parallel replication. Unfortunately, parallel replication cannot be realized without extending TCP to accommodate parallel transfer at the transport layer. This paper discusses how to extend TCP to make this possible and demonstrates feasibility by implementing a prototype in the Linux kernel. The prototype reduces data replication time by 25% while substantially reducing network use.
Networking and Internet Architecture