A Fully Quantitative Scheme With Fine-grained Tuning Method For Lightweight CNN Acceleration

Chen Yang, Bowen Li, Yizhou Wang
DOI: https://doi.org/10.1109/ICECS46596.2019.8964724
2019-01-01
Abstract: Based on the incremental network quantization (INQ) algorithm, this paper proposes a new quantization method with fine-grained tuning (named FT-INQ) to quantize the convolutional weights of lightweight Convolutional Neural Networks (CNNs). The key idea of FT-INQ is to employ two extra tuning bits to provide a denser value space for the quantization process, so that the accuracy of the floating-point CNN model is well maintained. With FT-INQ, each floating-point multiplication can be simplified to the sum of two shift operations. Compared with the original INQ algorithm, test results show that FT-INQ achieves 23% and 34% reductions in accuracy loss for MobileNetV1 and MobileNetV2, respectively, along with a 2.5x improvement in training speed.
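The claim that a multiplication reduces to the sum of two shifts can be illustrated with a minimal sketch. The abstract does not specify how the tuning bits are encoded, so the code below assumes (hypothetically) that each weight is approximated as a signed sum of at most two powers of two, with exponents restricted to an assumed range of -8 to 0. The names `quantize_two_shift` and `shift_multiply` are illustrative and do not come from the paper.

```python
def quantize_two_shift(w, min_exp=-8, max_exp=0):
    """Approximate w by sign * (2**e1 + 2**e2), sign * 2**e1, or 0.

    Assumption: this brute-force nearest-value search stands in for the
    paper's FT-INQ quantizer, whose exact encoding is not given in the abstract.
    Returns (sign, e1, e2); e2 is None when one power of two suffices.
    """
    if w == 0:
        return (0, None, None)
    sign = 1 if w > 0 else -1
    mag = abs(w)
    best_err, best = mag, (0, None, None)  # candidate: quantize to zero
    for e1 in range(max_exp, min_exp - 1, -1):
        err = abs(mag - 2.0 ** e1)  # single power of two
        if err < best_err:
            best_err, best = err, (sign, e1, None)
        for e2 in range(e1 - 1, min_exp - 1, -1):
            err = abs(mag - (2.0 ** e1 + 2.0 ** e2))  # sum of two powers
            if err < best_err:
                best_err, best = err, (sign, e1, e2)
    return best


def shift_multiply(x, code, frac_bits=8):
    """Multiply integer activation x by a quantized weight using shifts only.

    The result is a fixed-point integer scaled by 2**frac_bits, so every
    shift amount (e + frac_bits) is non-negative when min_exp >= -frac_bits.
    """
    sign, e1, e2 = code
    if sign == 0:
        return 0
    acc = x << (e1 + frac_bits)
    if e2 is not None:
        acc += x << (e2 + frac_bits)
    return sign * acc


code = quantize_two_shift(0.375)       # 0.375 = 2**-2 + 2**-3
product = shift_multiply(16, code)     # fixed-point result, scaled by 2**8
print(code, product / 2 ** 8)          # (1, -2, -3) 6.0
```

Under these assumptions, the product `16 * 0.375` is computed as `(16 << 6) + (16 << 5)` followed by a rescale, with no multiplier involved. A hardware implementation would typically fold the rescale into the accumulator's fixed-point format.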