HIR: An MLIR-based Intermediate Representation for Hardware Accelerator Description

Kingshuk Majumder, Uday Bondhugula
DOI: https://doi.org/10.48550/arXiv.2103.00194
2021-02-27
Abstract: The emergence of machine learning, image and audio processing on edge devices has motivated research towards power efficient custom hardware accelerators. Though FPGAs are an ideal target for energy efficient custom accelerators, the difficulty of hardware design and the lack of vendor agnostic, standardized hardware compilation infrastructure has hindered their adoption. This paper introduces HIR, an MLIR-based intermediate representation (IR) to describe hardware accelerator designs. HIR combines high level language features, such as loops and multi-dimensional tensors, with programmer defined explicit scheduling, to provide a high-level IR suitable for DSL compiler pipelines without compromising control over the micro-architecture of the accelerator. HIR's explicit schedules allow it to express fine-grained, synchronization-free parallelism and optimizations such as retiming and pipelining. Built as a dialect in MLIR, it draws from best IR practices learnt from communities like those of LLVM. While offering rich optimization opportunities and a high level abstraction, HIR enables sharing of optimizations, utilities and passes with software compiler infrastructure. Our implementation shows that the code generation time of the HIR code generator is on average 1112x lower than that of Xilinx Vivado HLS on a range of kernels without a compromise on the quality of the generated hardware. We believe that these are significant steps forward in the design of IRs for hardware synthesis and in equipping domain-specific languages with a productive and performing compilation path to custom hardware acceleration.
Hardware Architecture, Programming Languages
What problem does this paper attempt to address?
This paper addresses the difficulty of designing high-performance, low-power custom hardware accelerators for edge devices. Although FPGAs (Field-Programmable Gate Arrays) are an ideal target for energy-efficient custom accelerators, the difficulty of hardware design and the lack of a vendor-neutral, standardized hardware compilation infrastructure have hindered their adoption. To address this, the paper proposes HIR, an intermediate representation (IR) built on MLIR (Multi-Level Intermediate Representation), for describing hardware accelerator designs. HIR combines high-level language features (such as loops and multi-dimensional tensors) with programmer-defined explicit scheduling, providing a high-level IR suitable for DSL (Domain-Specific Language) compiler pipelines without sacrificing control over the accelerator's micro-architecture.

The main features of HIR include:

- **Explicit Scheduling**: Allows the expression of fine-grained, synchronization-free parallelism and of optimizations such as retiming and pipelining.
- **Rich Optimization Opportunities**: Offers rich optimization opportunities and a high level of abstraction, and can share optimizations, utilities, and passes with software compiler infrastructure.
- **High-Performance Code Generation**: Experiments show that the code generation time of the HIR code generator is on average 1,112 times lower than that of Xilinx Vivado HLS on a range of kernels, without compromising the quality of the generated hardware.

With these features, HIR aims to advance the design of IRs for hardware synthesis and to give domain-specific languages a productive, well-performing compilation path to custom hardware acceleration.
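The idea of explicit scheduling mentioned above can be illustrated with a small Python model. This is only a sketch and not HIR syntax: the `ScheduledOp` class, the helper functions, and the load/multiply/store loop body with its latencies are all hypothetical, chosen for illustration. The sketch assumes each operation in a loop body starts at a fixed cycle offset, and that a new iteration is launched every `initiation_interval` cycles. Because every start time is known statically, operations need no runtime handshake to stay in order, and pipelining amounts to choosing a smaller interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledOp:
    """Hypothetical operation with a statically assigned start time."""
    name: str
    offset: int   # start cycle relative to the start of its iteration
    latency: int  # cycles until the result is available


def pipelined_schedule(ops, trip_count, initiation_interval, start=0):
    """Return {(iteration, op_name): absolute start cycle}."""
    schedule = {}
    for i in range(trip_count):
        iteration_start = start + i * initiation_interval
        for op in ops:
            schedule[(i, op.name)] = iteration_start + op.offset
    return schedule


def total_cycles(ops, trip_count, initiation_interval):
    """Cycle at which the last result of the last iteration is ready."""
    depth = max(op.offset + op.latency for op in ops)
    return (trip_count - 1) * initiation_interval + depth


# Hypothetical loop body: load -> multiply -> store (latencies are assumptions).
body = [
    ScheduledOp("load", offset=0, latency=1),
    ScheduledOp("mul", offset=1, latency=2),
    ScheduledOp("store", offset=3, latency=1),
]

sched = pipelined_schedule(body, trip_count=4, initiation_interval=1)
sequential = total_cycles(body, trip_count=4, initiation_interval=4)
pipelined = total_cycles(body, trip_count=4, initiation_interval=1)

print(sched[(2, "mul")])      # start cycle of the multiply in iteration 2
print(sequential, pipelined)  # total cycles without and with overlap
```

In this model, launching one iteration every 4 cycles gives fully sequential execution, while an interval of 1 overlaps iterations. A real design would additionally have to check that overlapping operations do not contend for the same hardware resource, which this sketch does not model.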