Coded Computing for Multi-cluster Distributed Computations

Youlong Wu, Chenglin Li, Haoyang Hu, Xiyu Song, Shuai Ma, Yuanming Shi
DOI: https://doi.org/10.1109/tcomm.2024.3446641
IF: 6.166
2024-01-01
IEEE Transactions on Communications
Abstract: Distributed computing, which leverages distributed storage and computing resources, is a promising paradigm for handling large-scale computational tasks. However, its potential is often hindered by high communication latency due to limited network bandwidth. In this paper, we study the computation-communication tradeoff of multi-cluster MapReduce systems, where a central server connects to multiple clusters, each comprising a set of workers that jointly perform a MapReduce task. Workers can exchange information directly within their cluster (inner-cluster communication) or indirectly through the central server (cross-cluster communication). To reduce the communication load, we propose a nested coded distributed computing (CDC) scheme that is feasible for the heterogeneous scenario, in which different clusters can have arbitrary numbers of workers and computation loads. It is shown that our scheme greatly reduces the communication load compared to all existing schemes and can achieve the optimal cross-cluster communication load. In addition, the proposed scheme significantly reduces the computational complexity of conventional CDC schemes, whose computational complexity grows exponentially with the computation load.
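As background for the computation-communication tradeoff mentioned in the abstract, the following is a minimal sketch of the well-known single-cluster CDC result, not the nested multi-cluster scheme proposed in this paper. It assumes K workers and a computation load r (each input file is mapped at r workers), with the communication load normalized by the total size of all intermediate values. The function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def _check(K: int, r: int) -> None:
    # The classic tradeoff is stated for integer r with 1 <= r <= K.
    if not (1 <= r <= K):
        raise ValueError("require 1 <= r <= K")


def uncoded_load(K: int, r: int) -> Fraction:
    """Normalized shuffle load without coding: 1 - r/K."""
    _check(K, r)
    return 1 - Fraction(r, K)


def coded_load(K: int, r: int) -> Fraction:
    """Normalized shuffle load of classic single-cluster CDC: (1/r)(1 - r/K)."""
    _check(K, r)
    return Fraction(1, r) * (1 - Fraction(r, K))


def multicast_groups(K: int, r: int) -> int:
    """Number of (r+1)-worker multicast groups used by the classic scheme."""
    _check(K, r)
    return comb(K, r + 1)


if __name__ == "__main__":
    K = 10
    for r in (1, 2, 5):
        print(r, uncoded_load(K, r), coded_load(K, r), multicast_groups(K, r))
```

In this classic setting, coding reduces the shuffle load by a factor of r, while the number of multicast groups is a binomial coefficient in K and r. How the nested scheme modifies these quantities across clusters is specified in the paper itself and is not reproduced here.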