Galley: Modern Query Optimization for Sparse Tensor Programs

Kyle Deeds, Willow Ahrens, Magda Balazinska, Dan Suciu
2024-09-01
Abstract: The tensor programming abstraction has become a foundational paradigm for modern computing. This framework allows users to write high performance programs for bulk computation via a high-level imperative interface. Recent work has extended this paradigm to sparse tensors (i.e., tensors where most entries are not explicitly represented) with the use of sparse tensor compilers. These systems excel at producing efficient code for computation over sparse tensors, which may be stored in a wide variety of formats. However, they require the user to manually choose the order of operations and the data formats at every step. Unfortunately, these decisions are both highly impactful and complicated, requiring significant effort to manually optimize. In this work, we present Galley, a system for declarative sparse tensor programming. Galley performs cost-based optimization to lower these programs to a logical plan then to a physical plan. It then leverages sparse tensor compilers to execute the physical plan efficiently. We show that Galley achieves high performance on a wide variety of problems including machine learning algorithms, subgraph counting, and iterative graph algorithms.
Databases, Programming Languages
What problem does this paper attempt to address?
This paper asks how to simplify and optimize the writing and execution of sparse tensor programs so that users no longer have to choose the order of operations and the data formats by hand. Existing sparse tensor compilers (STCs) handle sparse tensor computations efficiently, but they require users to manually determine the order of operations, the data format of intermediate results, the loop order, and the iterative algorithm. These decisions strongly affect performance, yet they are complex and demand a great deal of manual tuning, which makes sparse tensor programs hard to write well.

To solve this problem, the authors propose Galley, a system for declarative sparse tensor programming. Its main goals are:

1. **Automatic Optimization**: Galley converts the user's high-level sparse tensor program into an efficient physical execution plan through cost-based optimization, so users do not have to perform complex optimizations manually.
2. **Logical Optimization**: Galley rewrites the input program as a series of aggregation steps, minimizing the total computation and materialization costs.
3. **Physical Optimization**: Galley selects the loop order, output format, and merge algorithm for each step in order to generate an efficient STC kernel.
4. **Sparsity Estimation**: Galley introduces a statistical framework that estimates the sparsity of intermediate results and guides the optimization process.

With these techniques, Galley achieves significant performance improvements on a variety of tasks, such as machine learning algorithms, subgraph counting, and iterative graph algorithms.
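To make the sparsity-estimation goal concrete, here is a minimal sketch of how the non-zero count of an intermediate result might be estimated before it is computed. It assumes non-zeros are placed uniformly and independently, which is a textbook simplification and not necessarily the statistics Galley uses; the function name `estimate_matmul_nnz` is hypothetical.

```python
def estimate_matmul_nnz(n_i, n_j, n_k, nnz_a, nnz_b):
    """Estimate nnz of C[i,k] = sum_j A[i,j] * B[j,k].

    Assumption (not from the paper): non-zeros in A and B are spread
    uniformly and independently, so each entry is non-zero with
    probability equal to the tensor's density.
    """
    p_a = nnz_a / (n_i * n_j)          # density of A
    p_b = nnz_b / (n_j * n_k)          # density of B
    # C[i,k] is zero only if all n_j products A[i,j] * B[j,k] are zero.
    p_c = 1.0 - (1.0 - p_a * p_b) ** n_j
    return p_c * n_i * n_k


# Two 1000 x 1000 matrices with 5000 non-zeros each.
print(estimate_matmul_nnz(1000, 1000, 1000, 5000, 5000))
```

An optimizer can compare such estimates across candidate plans and prefer the plan whose intermediates are expected to stay small.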
Experimental results show that Galley is 100 times faster than hand-optimized kernels on mixed dense-sparse workloads and 100 times faster than state-of-the-art databases on highly sparse workloads.

### Formula Representation

Some formulas from the article, written in Markdown format:

- Matrix chain multiplication:

\[ E_{im} = \sum_{jkl} A_{ij} B_{jk} C_{kl} D_{lm} \]

- The cost model used in physical optimization:

\[ \text{cost} \approx a \cdot \text{nnz}(\text{Agg}) + b \cdot \text{nnz}(\text{MapExpr}) \]

where \(\text{nnz}\) denotes the number of non-zero elements.

In this way, Galley not only simplifies the writing of sparse tensor programs but also significantly improves their execution efficiency.
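The effect of operation order on the matrix chain can be illustrated with a small sketch. It evaluates the same chain in two orders over dictionary-based sparse matrices and scores each order with a cost of the form a·nnz(Agg) + b·nnz(MapExpr). Reading nnz(Agg) as the size of each stored result and nnz(MapExpr) as the number of non-zero scalar products is an assumption made here for illustration, as are the weights `a = b = 1` and all function names; this is not Galley's implementation.

```python
from collections import defaultdict


def spmm(X, Y):
    """Multiply sparse matrices stored as {(row, col): value}.

    Returns the product and a pair (nnz of result, scalar products formed).
    """
    rows_of_y = defaultdict(list)
    for (j, k), v in Y.items():
        rows_of_y[j].append((k, v))
    out = defaultdict(float)
    products = 0
    for (i, j), u in X.items():
        for k, v in rows_of_y.get(j, ()):
            out[(i, k)] += u * v
            products += 1
    result = {key: val for key, val in out.items() if val != 0.0}
    return result, (len(result), products)


def cost(steps, a=1.0, b=1.0):
    # Hypothetical weights a and b; mirrors a*nnz(Agg) + b*nnz(MapExpr).
    return sum(a * agg + b * mapped for agg, mapped in steps)


def left_to_right(A, B, C, D):
    AB, s1 = spmm(A, B)        # materializes a large intermediate
    ABC, s2 = spmm(AB, C)
    E, s3 = spmm(ABC, D)
    return E, cost([s1, s2, s3])


def middle_first(A, B, C, D):
    BC, s1 = spmm(B, C)        # collapses to a tiny intermediate first
    BCD, s2 = spmm(BC, D)
    E, s3 = spmm(A, BCD)
    return E, cost([s1, s2, s3])


n = 50
A = {(i, 0): 1.0 for i in range(n)}   # n x 1 column
B = {(0, j): 1.0 for j in range(n)}   # 1 x n row
C = {(j, 0): 1.0 for j in range(n)}   # n x 1 column
D = {(0, m): 1.0 for m in range(n)}   # 1 x n row

E1, cost1 = left_to_right(A, B, C, D)
E2, cost2 = middle_first(A, B, C, D)
print(E1 == E2, cost1, cost2)
```

Both orders produce the same output, but the second avoids building the n × n intermediate `AB`, so its modeled cost is noticeably lower. A cost-based optimizer makes this kind of choice automatically instead of leaving it to the user.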