Towards In-network Acceleration of Erasure Coding

Yi Qiao, Xiao Kong, Menghao Zhang, Yu Zhou, Mingwei Xu, Jun Bi
DOI: https://doi.org/10.1145/3373360.3380833
2020-01-01
Abstract: In distributed storage systems, erasure coding (EC) is a crucial technology that provides high fault tolerance with lower storage overhead than data replication. EC can reconstruct missing data by downloading parity data from surviving machines. However, the EC downloading streams multiplex the available network I/O at the receiving end, leading to a substantially reduced data reconstruction speed. In this paper, we present NetEC, a novel in-network acceleration system that fully offloads EC to programmable switching ASICs. NetEC avoids multiplexing network I/O by aggregating the downloading streams on the switch, thus significantly improving reconstruction speed. NetEC addresses three key challenges: computation offloading of complex EC operations, rate synchronization of multiple downloading streams, and deep payload inspection/assembly. We implement NetEC on hardware programmable switches. Evaluation shows that, compared to HDFS-EC, NetEC improves the reconstruction rate by 2.7x-9.0x and eliminates CPU overheads, with low switch memory usage.
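
The reconstruction bottleneck described in the abstract can be illustrated with a toy sketch. The code below assumes a single XOR parity block, which is a deliberate simplification and not the coding scheme or switch implementation used by NetEC; the block contents and all names (`xor_blocks`, `survivors`, and so on) are hypothetical. It only shows why combining the surviving streams before they reach the receiver reduces the bytes the receiver must download.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one XOR parity block (a toy single-parity code,
# assumed here for illustration only).
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# Suppose the machine holding data[1] fails; the other blocks survive.
survivors = [data[0], data[2], parity]

# Without aggregation, the receiver downloads every surviving block and
# decodes locally, so all streams share its incoming link.
recovered = xor_blocks(survivors)
bytes_at_receiver_plain = sum(len(block) for block in survivors)

# With aggregation, an intermediate node (standing in for the switch)
# combines the streams and forwards a single block to the receiver.
aggregated = xor_blocks(survivors)
bytes_at_receiver_aggregated = len(aggregated)
```

In this sketch the receiver takes in three blocks' worth of data without aggregation and one block's worth with it. The abstract's reported speedups come from the authors' hardware evaluation, not from this model.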