P2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
Huihong Shi,Xin Cheng,Wendong Mao,Zhongfeng Wang
DOI: https://doi.org/10.1109/tvlsi.2024.3422684
2024-08-28
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Abstract:Vision transformers (ViTs) have excelled in computer vision (CV) tasks but are memory-consuming and computation-intensive, challenging their deployment on resource-constrained devices. To tackle this limitation, prior works have explored ViT-tailored quantization algorithms but retained floating-point scaling factors, which yield nonnegligible requantization overhead, limiting ViTs' hardware efficiency and motivating more hardware-friendly solutions. To this end, we propose P2-ViT, the first power-of-two (PoT) posttraining quantization (PTQ) and acceleration framework to accelerate fully quantized ViTs. Specifically, as for quantization, we explore a dedicated quantization scheme to effectively quantize ViTs with PoT scaling factors, thus minimizing the requantization overhead. Furthermore, we propose coarse-to-fine automatic mixed-precision quantization to enable better accuracy-efficiency tradeoffs. In terms of hardware, we develop a dedicated chunk-based accelerator featuring multiple tailored subprocessors to individually handle ViTs' different types of operations, alleviating reconfigurable overhead. In addition, we design a tailored row-stationary dataflow to seize the pipeline processing opportunity introduced by our PoT scaling factors, thereby enhancing throughput. Extensive experiments consistently validate P2-ViT's effectiveness. Particularly, we offer comparable or even superior quantization performance with PoT scaling factors when compared with the counterpart with floating-point scaling factors. Besides, we achieve up to speedup and energy saving over GPU's Turing Tensor Cores, and up to higher computation utilization efficiency against SOTA quantization-based ViT accelerators. Codes are available at https://github.com/shihuihong214/P2-ViT.
engineering, electrical & electronic,computer science, hardware & architecture