Abstract: Large matrix multiplications are central to large-scale machine learning applications. These operations are often carried out on a distributed computing platform with a master server and multiple workers in the cloud operating in parallel. For such distributed platforms, it has recently been shown that coding over the input data matrices can reduce the computational delay, yielding a trade-off between the recovery threshold, i.e., the number of workers required to recover the matrix product, and the communication load, i.e., the total amount of data to be downloaded from the workers. In this paper, in addition to exact recovery requirements, we impose security and privacy constraints on the data matrices, and study the recovery threshold as a function of the communication load. We first assume that both matrices contain private information and that workers can collude to eavesdrop on the content of these data matrices. For this problem, we introduce a novel class of secure codes, referred to as secure generalized PolyDot (SGPD) codes, that generalize state-of-the-art non-secure codes for matrix multiplication. SGPD codes allow a flexible trade-off between recovery threshold and communication load for a fixed maximum number of colluding workers while providing perfect secrecy for the two data matrices. We then study a connection between secure matrix multiplication and private information retrieval. We specifically assume that one of the data matrices is taken from a public set known to all the workers. In this setup, the identity of the matrix of interest should be kept private from the workers. For this model, we present a variant of generalized PolyDot codes that can guarantee both secrecy of one matrix and privacy for the identity of the other matrix for the case of no colluding servers.
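To make the notions of recovery threshold and perfect secrecy concrete, the following is a minimal sketch of a much simpler scheme than SGPD codes: a degree-1 polynomial masking of both matrices over a prime field, secure against a single non-colluding worker. It is not the construction from the paper. The modulus `P`, the evaluation points, and all function names are illustrative assumptions. Each worker receives `A + R*x` and `B + S*x` for its own nonzero point `x`, where `R` and `S` are uniformly random masks, so a single worker's shares are uniformly distributed and reveal nothing about `A` or `B`. The product polynomial has degree 2, so any 3 worker results suffice to interpolate `AB`, i.e., the recovery threshold of this toy scheme is 3.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1 (illustrative choice of field)

def matmul(X, Y):
    """Matrix product over GF(P) on lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*Y)]
            for row in X]

def rand_matrix(rows, cols, rng):
    """Uniformly random mask matrix over GF(P)."""
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]

def encode(M, R, x):
    """Share sent to the worker with evaluation point x: M + R*x (mod P)."""
    return [[(m + r * x) % P for m, r in zip(m_row, r_row)]
            for m_row, r_row in zip(M, R)]

def lagrange_at_zero(xs):
    """Coefficients c_i such that sum_i c_i * f(x_i) = f(0) when deg f < len(xs)."""
    coeffs = []
    for i, xi in enumerate(xs):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        coeffs.append(num * pow(den, -1, P) % P)
    return coeffs

def secure_multiply(A, B, num_workers=5, responders=(0, 2, 4), seed=0):
    """Recover A @ B (mod P) from the 3 workers listed in `responders`."""
    # Assumption: random.Random is used only for reproducibility of the sketch;
    # a real deployment would need a cryptographically secure source.
    rng = random.Random(seed)
    R = rand_matrix(len(A), len(A[0]), rng)
    S = rand_matrix(len(B), len(B[0]), rng)
    xs = list(range(1, num_workers + 1))  # distinct nonzero evaluation points

    # Each responding worker multiplies its two shares; stragglers are ignored.
    results = {i: matmul(encode(A, R, xs[i]), encode(B, S, xs[i]))
               for i in responders}

    # (A + R x)(B + S x) = AB + (AS + RB) x + RS x^2, so AB is the value at x = 0.
    idx = list(responders)
    c = lagrange_at_zero([xs[i] for i in idx])
    rows, cols = len(A), len(B[0])
    return [[sum(ck * results[i][r][s] for ck, i in zip(c, idx)) % P
             for s in range(cols)] for r in range(rows)]
```

In this sketch every responding worker returns a full-size product, so the communication load is three times the size of `AB`. Schemes such as the SGPD codes described in the abstract additionally partition the input matrices, which is what allows the recovery threshold to be traded against the communication load.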

Near-Optimal Fault Tolerance for Efficient Batch Matrix Multiplication via an Additive Combinatorics Lens

On the Optimal Recovery Threshold of Coded Matrix Multiplication

Coded Computing for Resilient, Secure, and Privacy-Preserving Distributed Matrix Multiplication

Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding

Entangled Polynomial Codes for Secure, Private, and Batch Distributed Matrix Multiplication: Breaking the "cubic" Barrier

Distributed Matrix Multiplication with a Smaller Recovery Threshold through Modulo-based Approaches

Variable Coded Batch Matrix Multiplication

Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication

Algebraic Geometric Rook Codes for Coded Distributed Computing

Rateless Codes for Near-Perfect Load Balancing in Distributed Matrix-Vector Multiplication

Rateless Codes for Private Distributed Matrix-Matrix Multiplication

Folded Polynomial Codes for Coded Distributed $AA^\top$-Type Matrix Multiplication

Matrix Multiplication Verification Using Coding Theory

Distributed Matrix Computations with Low-weight Encodings

Algorithmic Based Fault Tolerance Applied to High Performance Computing

Random Alloy Codes and the Fundamental Limits of Coded Distributed Tensors

Coded Sparse Matrix Multiplication

Faster Matrix Multiplication Via Asymmetric Hashing

Fast Matrix Multiplication Without Tears: A Constraint Programming Approach

An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation

Private and Secure Distributed Matrix Multiplication With Flexible Communication Load