Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference

Thierry Moreau,Tianqi Chen,Luis Ceze
DOI: https://doi.org/10.1145/3229762.3229766
2018-06-20
Abstract:We present a full-stack design to accelerate deep learning inference with FPGAs. Our contribution is two-fold. At the software layer, we leverage and extend TVM, the end-to-end deep learning optimizing compiler, in order to harness FPGA-based acceleration. At the the hardware layer, we present the Versatile Tensor Accelerator (VTA) which presents a generic, modular, and customizable architecture for TPU-like accelerators. Our results take a ResNet-18 description in MxNet and compiles it down to perform 8-bit inference on a 256-PE accelerator implemented on a low-cost Xilinx Zynq FPGA, clocked at 100MHz. Our full hardware acceleration stack will be made available for the community to reproduce, and build upon at http://github.com/uwsaml/vta.
What problem does this paper attempt to address?