Algebraic Geometry Codes for Distributed Matrix Multiplication Using Local Expansions

Jiang Li, Songsong Li, Chaoping Xing
2024-08-03
Abstract: Code-based Distributed Matrix Multiplication (DMM) has been extensively studied in distributed computing for efficiently performing large-scale matrix multiplication using coding theoretic techniques. The communication cost and recovery threshold (i.e., the least number of successful worker nodes required to recover the product of two matrices) are two major challenges in coded DMM research. Several constructions based on Reed-Solomon (RS) codes are known, including Polynomial codes, MatDot codes, and PolyDot codes. However, these RS-based schemes are not efficient for small finite fields because the distributed order (i.e., the total number of worker nodes) is limited by the size of the underlying finite field. Algebraic geometry (AG) codes can have a code length exceeding the size of the finite field, which helps solve this problem. Some work has been done to generalize Polynomial and MatDot codes to AG codes, but the generalization of PolyDot codes to AG codes remains an open problem as far as we know. This is because functions of an algebraic curve do not behave as nicely as polynomials. In this work, by using local expansions of functions, we are able to generalize the three DMM schemes based on RS codes to AG codes. Specifically, we provide a construction of AG-based PolyDot codes for the first time. In addition, our AG-based Polynomial and MatDot codes achieve better recovery thresholds compared to previous AG-based DMM schemes while maintaining similar communication costs. Our constructions are based on a novel basis of the Riemann-Roch space using local expansions, which naturally generalizes the standard monomial basis of the univariate polynomial space in RS codes. In contrast, previous work used the non-gap numbers to construct a basis of the Riemann-Roch space, which can cause cancellation problems that prevent the conditions of PolyDot codes from being satisfied.
Information Theory; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper addresses two main challenges in Distributed Matrix Multiplication (DMM): communication cost and recovery threshold. Specifically:

1. **Communication Cost**: In distributed computing, data must be exchanged between worker nodes, which introduces significant communication overhead.
2. **Recovery Threshold**: The recovery threshold is the minimum number of worker nodes that must finish successfully to recover the product of two matrices. Because of "straggler" worker nodes (i.e., nodes that run slowly or are prone to delay), the recovery threshold is a crucial issue in practical applications.

### Background

Traditional Reed-Solomon (RS) code-based distributed matrix multiplication schemes are not efficient over small finite fields because the number of worker nodes \( N \) is limited by the size \( q \) of the finite field, i.e., \( N \leq q \). Algebraic Geometry (AG) codes, as a generalization of RS codes, break this limit: their code length can exceed the size of the finite field.

### Main Contributions

In the thresholds below, \( g \) denotes the genus of the algebraic function field.

1. **AG-based Polynomial DMM**:
   - Proposes an AG-based Polynomial DMM scheme with a recovery threshold of \( R = 2g + mn \).
   - If \( m \in W(P) \) or \( n \in W(P) \), the recovery threshold can be further reduced to \( R = g + mn \).
2. **AG-based MatDot DMM**:
   - Proposes an AG-based MatDot DMM scheme with a recovery threshold of \( R = 2g + 2p - 1 \).
3. **AG-based PolyDot DMM**:
   - Proposes an AG-based PolyDot DMM scheme for the first time, with a recovery threshold of:

\[
R=\begin{cases}
4g+(2p - 1)mn+2mn - 2m & \text{if } m = 1 \text{ or } m\geq n\geq 2,\\
4g+(2p - 1)mn+2mn - 2n & \text{if } n = 1 \text{ or } n > m\geq 2.
\end{cases}
\]

### Technical Methods

- **Basis of Riemann-Roch Space**: Constructs a new basis of the Riemann-Roch space through local expansions, which naturally generalizes the standard monomial basis.
- **Special Divisors**: Uses certain special divisors instead of single-point divisors to expand the Riemann-Roch space, which avoids the cancellation problem and achieves a better recovery threshold.

### Performance Comparison

Compared with the known RS-code-based DMM schemes, the AG-based DMM schemes proposed in this paper have lower bit complexity under the same distributed order \( N \), especially when \( N > q \). Moreover, if the genus \( g \) of the algebraic function field satisfies certain conditions, the bit complexity of downloading is also better than that of the RS-code-based DMM schemes.

### Conclusion

This paper addresses the communication cost and recovery threshold problems in distributed matrix multiplication by introducing AG-code-based DMM schemes, which are particularly suited to large matrices and small finite fields. Compared with previous AG-based schemes, the new Polynomial and MatDot constructions improve the recovery threshold while maintaining a similar communication cost.
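
To make the recovery thresholds listed under Main Contributions easier to compare, the following is a minimal Python sketch that simply evaluates those formulas as they are stated in this summary. The function names and the boolean flag `m_or_n_in_WP` are hypothetical and introduced only for illustration; the flag stands for the condition \( m \in W(P) \) or \( n \in W(P) \), which the sketch takes as an input rather than computing from a curve.

```python
def polynomial_threshold(g, m, n, m_or_n_in_WP=False):
    """AG-based Polynomial DMM: R = 2g + mn, or g + mn under the W(P) condition."""
    # m_or_n_in_WP is an assumed input flag; this sketch does not compute W(P).
    return (g if m_or_n_in_WP else 2 * g) + m * n


def matdot_threshold(g, p):
    """AG-based MatDot DMM: R = 2g + 2p - 1."""
    return 2 * g + 2 * p - 1


def polydot_threshold(g, m, n, p):
    """AG-based PolyDot DMM: piecewise formula as stated in the summary."""
    base = 4 * g + (2 * p - 1) * m * n + 2 * m * n
    if m == 1 or m >= n >= 2:
        return base - 2 * m
    if n == 1 or n > m >= 2:
        return base - 2 * n
    raise ValueError("m, n and p must be positive integers")


if __name__ == "__main__":
    # Illustrative parameters only: genus 1, a 2 x 3 block partition, p = 2.
    g, m, n, p = 1, 2, 3, 2
    print("Polynomial:", polynomial_threshold(g, m, n))
    print("Polynomial (W(P) condition):", polynomial_threshold(g, m, n, True))
    print("MatDot:", matdot_threshold(g, p))
    print("PolyDot:", polydot_threshold(g, m, n, p))
```

As a sanity check, setting \( g = 0 \) in the Polynomial and MatDot formulas gives \( mn \) and \( 2p - 1 \), the recovery thresholds of the original RS-based Polynomial and MatDot codes, so the genus terms can be read as the price paid for a code length that exceeds \( q \).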