CoNST: Code Generator for Sparse Tensor Networks

Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, Saday Sadayappan
2024-01-10
Abstract: Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication. Tensor networks arise commonly in many domains of scientific computing and data science. After a transformation into a tree of binary contractions, the network is implemented as a sequence of individual contractions. Several critical aspects must be considered in the generation of efficient code for a contraction tree, including sparse tensor layout mode order, loop fusion to reduce intermediate tensors, and the interdependence of loop order, mode order, and contraction order. We propose CoNST, a novel approach that considers these factors in an integrated manner using a single formulation. Our approach creates a constraint system that encodes these decisions and their interdependence, while aiming to produce reduced-order intermediate tensors via fusion. The constraint system is solved by the Z3 SMT solver and the result is used to create the desired fused loop structure and tensor mode layouts for the entire contraction tree. This structure is lowered to the IR of the TACO compiler, which is then used to generate executable code. Our experimental evaluation demonstrates very significant (sometimes orders of magnitude) performance improvements over current state-of-the-art sparse tensor compiler/library alternatives.
Programming Languages; Distributed, Parallel, and Cluster Computing; Performance
### What problems does this paper attempt to solve?

This paper addresses the problem of generating efficient code for sparse tensor networks. Specifically, the authors propose CoNST (Code Generator for Sparse Tensor Networks), a new method for generating efficient code for sparse tensor network contractions. It targets three key problems:

1. **Sparse tensor layout mode order**: Sparse tensors are usually stored in the CSF (Compressed Sparse Fiber) format, which holds non-zero elements in a nested structure. Accessing these elements efficiently requires an appropriate mode order, that is, the order in which the tensor's modes are nested in the layout. Different mode orders lead to different performance, so selecting the best one is an important problem.
2. **Loop fusion to reduce intermediate tensors**: Multi-tensor contractions produce temporary intermediate tensors, which can be very large and cause high memory usage. Fusing common loops can significantly reduce the size of these intermediates and thereby improve efficiency. For example, fusing the loops over common indices (such as \(i\) and \(j\)) can reduce an intermediate tensor from four-dimensional to two-dimensional.
3. **Interdependence of loop order, mode order, and contraction order**: Generating efficient code for a contraction tree requires considering loop order, mode order, and contraction order together, because the three are interdependent. For example, the choice of mode order can determine which loops can be fused, and the contraction order in turn affects the choice of mode order. Prior work has not considered these interdependent factors systematically.

To solve these problems, CoNST proposes a constraint-based synthesis method.
It creates a constraint system that encodes these decisions and their interdependencies, and solves it with the Z3 SMT solver. The generated code can significantly improve the performance of sparse tensor network contractions, sometimes by orders of magnitude over existing state-of-the-art sparse tensor compilers and libraries.

### Main contributions of the paper

- **A new constraint-based method** for encoding possible loop fusion structures and tensor CSF layouts, with the goal of reducing the order of intermediate tensors.
- **A method for lowering constraint solutions to the TACO compiler IR**, from which TACO generates executable code.
- **An extensive experimental evaluation** in which CoNST shows significant performance improvements over existing systems such as TACO, SparseLNR, and Sparta on multiple benchmarks.

### Summary

CoNST addresses the efficiency of code generation for sparse tensor networks. It generates efficient contraction tree code by jointly optimizing sparse tensor layout mode order and loop fusion while accounting for the interdependencies among these decisions.
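
To make the loop-fusion point concrete, the following is a minimal sketch in plain Python and is not code produced by CoNST or TACO. It assumes a hypothetical two-step contraction, `T[i,j,k,l] = sum_m A[i,j,m] * B[m,k,l]` followed by `R[i,j] = sum_{k,l} T[i,j,k,l] * C[k,l]`, with sparse tensors stored as coordinate dictionaries. The tensor names, shapes, and values are illustrative assumptions; the summary does not specify a particular expression. The fused version shares the `i` and `j` loops between both contractions, so only a two-dimensional slice of `T` is live at any time.

```python
from collections import defaultdict

# Hypothetical sparse tensors stored as {coordinate tuple: value};
# coordinates that are absent are treated as zero.
A = {(0, 0, 0): 1.0, (0, 1, 1): 2.0, (1, 0, 0): 3.0}   # A[i, j, m]
B = {(0, 0, 1): 4.0, (1, 1, 0): 5.0, (0, 1, 1): 6.0}   # B[m, k, l]
C = {(0, 1): 7.0, (1, 0): 8.0, (1, 1): 9.0}            # C[k, l]


def contract_unfused(A, B, C):
    """Materialize the full 4-D intermediate T[i, j, k, l], then reduce it."""
    T = defaultdict(float)
    for (i, j, m), a in A.items():
        for (m2, k, l), b in B.items():
            if m == m2:
                T[(i, j, k, l)] += a * b
    R = {}
    for (i, j, k, l), t in T.items():
        if (k, l) in C:
            R[(i, j)] = R.get((i, j), 0.0) + t * C[(k, l)]
    # Report how many index modes the intermediate has and how many
    # entries were stored at once.
    return R, {"modes": 4, "peak_entries": len(T)}


def contract_fused(A, B, C):
    """Share the i and j loops so only a 2-D slice T_ij[k, l] is live."""
    # Group A by its (i, j) prefix, mimicking a layout whose outer modes
    # are i and j so that the fused loops can walk it directly.
    A_by_ij = defaultdict(list)
    for (i, j, m), a in A.items():
        A_by_ij[(i, j)].append((m, a))

    R = {}
    peak = 0
    for (i, j), fiber in A_by_ij.items():
        T_ij = defaultdict(float)  # 2-D intermediate, rebuilt per (i, j)
        for m, a in fiber:
            for (m2, k, l), b in B.items():
                if m == m2:
                    T_ij[(k, l)] += a * b
        peak = max(peak, len(T_ij))
        for (k, l), t in T_ij.items():
            if (k, l) in C:
                R[(i, j)] = R.get((i, j), 0.0) + t * C[(k, l)]
    return R, {"modes": 2, "peak_entries": peak}


R_unfused, stats_unfused = contract_unfused(A, B, C)
R_fused, stats_fused = contract_fused(A, B, C)
```

Both functions compute the same `R`, but the fused variant only ever stores the entries of one `(i, j)` slice of the intermediate. The grouping step in `contract_fused` also hints at why mode order matters: fusing on `i` and `j` is straightforward only when those modes are outermost in the layout of `A`.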