2D-SAZD: A Novel 2D Coded Distributed Computing Framework for Matrix-Matrix Multiplication

Mingjun Dai, Zelong Zhang, Ziying Zheng, Zhonghao Zhang, Xiaohui Lin, Hui Wang
DOI: https://doi.org/10.1109/tsc.2024.3395931
IF: 11.019
2024-06-14
IEEE Transactions on Services Computing
Abstract: Coded distributed computing (CDC) splits a high-dimensional matrix-matrix multiplication at a single computing node into small, suitably encoded matrix multiplications that run in parallel on worker nodes; this mitigates the straggler problem and significantly speeds up the computation. Existing CDC encoding schemes are based on linear combination (LC), which has two drawbacks. First, it imposes a heavy computational burden on both the encoding and decoding phases. Second, it causes large numerical errors in the decoding phase. To mitigate these two drawbacks, a new 2D-SAZD-CDC framework that non-trivially generalizes 1D-SAZD-CDC is proposed, where D stands for dimension, and encoding and decoding are implemented by shift-and-add (SA) and zigzag decoding (ZD), which replace LC and matrix inversion, respectively. The generalization is non-trivial because the operations along the two dimensions must be designed jointly in both the encoding and decoding phases to ensure that the combination property (CP) and ZD hold from the 2D viewpoint. More specifically, 2D-SA encoding is designed, 2D-ZD decoding (which alternates intermittently between the two dimensions) is proposed, and a proof that CP and ZD are satisfied from the 2D viewpoint is given. Numerical studies show that 2D-SAZD-CDC significantly improves numerical stability and computational load over existing LC-based schemes.
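The abstract does not spell out the 2D construction, so the following is only a minimal 1D sketch of the general shift-and-add and zigzag-decoding idea, not the authors' scheme. It assumes A is split into two row blocks, that two coded blocks (one unshifted, one with the second block shifted down by one row) are each multiplied by B, and that both results arrive. All function names and the toy matrices are hypothetical, chosen for illustration.

```python
def matmul(X, Y):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def shift_and_add(A1, A2, shift):
    """Add A2 to A1 after shifting A2 down by `shift` rows (zero padded)."""
    rows, cols = len(A1), len(A1[0])
    out = [[0] * cols for _ in range(rows + shift)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] += A1[r][c]
            out[r + shift][c] += A2[r][c]
    return out


def zigzag_decode(C0, C1, rows):
    """Recover P1 = A1*B and P2 = A2*B from
    C0 = P1 + P2 and C1 = P1 + (P2 shifted down by one row),
    using only subtractions (no matrix inversion)."""
    P1, P2 = [], []
    for r in range(rows):
        # Row r of C1 holds P1[r] plus P2[r-1]; for r == 0 it is P1[0] alone.
        prev = P2[r - 1] if r > 0 else [0] * len(C1[0])
        P1.append([c - p for c, p in zip(C1[r], prev)])
        # Row r of C0 holds P1[r] + P2[r], so P2[r] follows by subtraction.
        P2.append([c - p for c, p in zip(C0[r], P1[r])])
    return P1, P2


# Toy data (hypothetical): A is 4x2, split into two 2x2 row blocks.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0], [2, 1]]
A1, A2 = A[:2], A[2:]

E0 = shift_and_add(A1, A2, 0)  # coded block without shift
E1 = shift_and_add(A1, A2, 1)  # coded block with a one-row shift

# Each product would be computed by a separate worker.
C0 = matmul(E0, B)
C1 = matmul(E1, B)

P1, P2 = zigzag_decode(C0, C1, len(A1))
product = P1 + P2  # stack the row blocks to rebuild A*B
```

Because right-multiplication by B acts row by row, the shift applied to the blocks of A carries over unchanged to the worker outputs, which is what lets the decoder peel off one exposed row at a time. A scheme that tolerates stragglers would generate more coded blocks than the minimum needed for decoding; this sketch uses exactly two for brevity.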
computer science, information systems, software engineering
What problem does this paper attempt to address?