Exploiting Network Loss for Distributed Approximate Computing with NetApprox

Ke Liu, Jinmou Li, Shin-Yeh Tsai, Theophilus Benson, Yiying Zhang
DOI: https://doi.org/10.48550/arXiv.1901.01632
2022-06-30
Abstract: Many data center applications, such as machine learning and big data analytics, can complete their analysis without processing the complete set of data. Extensive approximate-aware optimizations have been proposed at the hardware, programming language, and application levels; however, to date, approximate computing optimizations have ignored the network layer. We propose NetApprox, which, to the best of our knowledge, is the first approximate-aware network layer, comprising a transport-layer protocol, network resource allocation schemes, and scheduling/priority-assignment policies. Building on the observation that approximate applications can tolerate loss, NetApprox's main insights are to aggressively send approximate traffic (which improves the performance of approximate applications) and to minimize the network resources allocated to approximate traffic (which limits the impact of aggressive approximate traffic while freeing up resources that, in turn, improve non-approximate applications' performance). We ported Flink, Kafka, Spark, and PyTorch to NetApprox and evaluated NetApprox with both large-scale simulation and a real implementation. Our evaluation results show that NetApprox improves job completion times by up to 80% compared to network-oblivious approximation solutions, and improves the performance of co-running non-approximate workloads by 79%.
Networking and Internet Architecture