Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen
2023-11-02
Abstract: Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven demand for deploying them to a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces first-class symbolic shape annotations to track dynamic shape computations globally across the program. It also introduces a cross-level abstraction that encapsulates computational graphs, loop-level tensor programs, and library calls in a single representation to enable cross-level optimizations. We build an end-to-end compilation framework using the proposed approach to optimize dynamic shape models. Experimental results on large language models show that Relax delivers performance competitive with state-of-the-art hand-optimized systems across platforms and enables deployment of emerging dynamic models to a broader set of environments, including mobile phones, embedded devices, and web browsers.
Machine Learning, Artificial Intelligence, Programming Languages