Abstract: A DL compiler's primary function is to translate DNN programs written in high-level DL frameworks such as PyTorch and TensorFlow into portable executables, which deployed host programs can then invoke flexibly. However, existing DL compilers rely on a tracing mechanism: they feed a runtime input to a neural network program and trace its execution path to generate the computational graph needed for compilation. This mechanism falls short for modern dynamic neural networks (DyNNs), whose computational graphs vary with the input. Consequently, conventional DL compilers struggle to compile DyNNs into correct executable code. To address this limitation, we propose \tool, a general approach that enables any existing DL compiler to compile DyNNs successfully. \tool handles the dynamic nature of DyNNs with a compilation mechanism that redistributes the control and data flow of the original DNN program during compilation. Specifically, \tool develops program analysis and program transformation techniques to convert a dynamic neural network into multiple sub-neural networks. Each sub-neural network is free of conditional statements and is compiled independently. Furthermore, \tool synthesizes a host module that models the control flow of the DyNN and invokes the sub-neural networks. Our evaluation demonstrates the effectiveness of \tool, which achieves a 100\% success rate in compiling all evaluated dynamic neural networks. Moreover, the compiled executables generated by \tool run between $1.12\times$ and $20.21\times$ faster than the original DyNNs executed on general-purpose DL frameworks.
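The tracing limitation and the rewriting idea described in the abstract can be illustrated with a toy sketch. This is not the implementation of \tool; it is a minimal, dependency-free illustration under stated assumptions. The `Traced` class, `trace`, `dynamic_net`, the `sub_*` functions, and `host` are all hypothetical names, plain floats stand in for tensors, and replaying a recorded tape stands in for a compiled executable.

```python
class Traced:
    """Scalar stand-in for a tensor; records each arithmetic op on a shared tape."""

    def __init__(self, value, tape):
        self.value, self.tape = value, tape

    def _emit(self, op, const, result):
        self.tape.append((op, const))
        return Traced(result, self.tape)

    def __mul__(self, const):
        return self._emit("mul", const, self.value * const)

    def __add__(self, const):
        return self._emit("add", const, self.value + const)

    def __gt__(self, const):
        # The comparison is resolved on the concrete example value and is NOT
        # recorded, so the branch decision is baked into the trace.
        return self.value > const


def trace(fn, example):
    """Run fn once on an example input and return a replay of the recorded ops."""
    tape = []
    fn(Traced(example, tape))

    def replay(x):
        for op, const in tape:
            x = x * const if op == "mul" else x + const
        return x

    return replay


def dynamic_net(x):
    """Toy dynamic network: the path taken depends on the input."""
    h = x * 2.0
    if h > 0:
        return h + 1.0
    return h * -1.0


# Tracing the whole network with a positive example records only one branch.
traced = trace(dynamic_net, 3.0)


# Rewriting: split the network into conditional-free sub-networks ...
def sub_shared(x):
    return x * 2.0


def sub_then(h):
    return h + 1.0


def sub_else(h):
    return h * -1.0


# ... trace ("compile") each one independently ...
compiled = {
    "shared": trace(sub_shared, 1.0),
    "then": trace(sub_then, 1.0),
    "else": trace(sub_else, 1.0),
}


# ... and let a host module carry the control flow.
def host(x):
    h = compiled["shared"](x)
    return compiled["then"](h) if h > 0 else compiled["else"](h)
```

In this sketch, `traced(-3.0)` silently follows the branch recorded for the positive example and disagrees with `dynamic_net(-3.0)`, whereas `host(-3.0)` matches it, because the branch decision is made at run time in the host rather than frozen into a single trace.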

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

TSCompiler: Efficient Compilation Framework for Dynamic-Shape Models

DISC: A Dynamic Shape Compiler for Machine Learning Workloads

BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads Via Compiler Approach

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

Pattern-Based Dynamic Compilation System for CGRAs With Online Configuration Transformation

Relay: A High-Level Compiler for Deep Learning

CompilerDream: Learning a Compiler World Model for General Code Optimization

Large Language Models for Compiler Optimization

ALT: Breaking the Wall Between Data Layout and Loop Optimizations for Deep Learning Compilation

DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization

CMLCompiler: A Unified Compiler for Classical Machine Learning

Optimizing DNN Computation with Relaxed Graph Substitutions

Machine Learning in Compiler Optimisation

Optimizing Large Language Models for Dynamic Constraints through Human-in-the-Loop Discriminators

Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning

Towards Compile-Time-Reducing Compiler Optimization Selection via Machine Learning

LazyTensor: combining eager execution with domain-specific compilers

A Formalism of DNN Accelerator Flexibility

RAF: Holistic Compilation for Deep Learning Model Training