Activations Quantization for Compact Neural Networks

Yadong Wei, Zhixu Zhao, Taoyang, Zishuo Zhen, Zeyu Hao, Pengju Ren
DOI: https://doi.org/10.1109/icdsp.2018.8631650
2018-01-01
Abstract: As optimizing neural networks for processing speed has become a necessary step in industrial deployment, the challenges posed by limited storage and computation when deploying neural networks on embedded systems need to be addressed. Some previous studies focus on training neural networks with low-precision weights and activations, such as Binarized Neural Networks and DoReFa-Net. Meanwhile, an alternative approach reduces the number of operations and the model size through compact architectures such as MobileNet and SqueezeNet. In this work, by adopting the SQ (Scale Quantization) method to quantize the activations of each neuron, we obtain more satisfactory results, especially on compact network models. For SqueezeDet on the KITTI dataset, SQ achieves 22.86% higher accuracy on mean correct detections, while mean location and classification errors decrease by 41.84% and 68.57%, respectively, compared with DoReFa-Net with 2-bit activations. Additionally, we propose the SQT (Scale Quantization with Threshold) method to further improve quantization flexibility. It is worth mentioning that our approach is primarily designed for activation quantization, but it can also be combined with other orthogonal methods to further achieve efficient neural network processing at the edge.
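For readers unfamiliar with the 2-bit baseline mentioned above, the following is a minimal sketch of uniform k-bit activation quantization in the style commonly associated with DoReFa-Net: the activation is clipped to [0, 1] and rounded to one of 2^k evenly spaced levels. It is not the SQ or SQT method, which the abstract does not specify; the function name `quantize_activation` and the sample values are illustrative assumptions.

```python
def quantize_activation(x, bits=2):
    """Uniform k-bit quantization of one activation (hypothetical helper).

    Assumes the activation is first clipped to [0, 1]; this illustrates the
    baseline idea only, not the paper's SQ/SQT method.
    """
    levels = (1 << bits) - 1             # 2 bits -> 3 steps, 4 representable values
    clipped = min(max(x, 0.0), 1.0)      # bound the activation before rounding
    return round(clipped * levels) / levels  # snap to the nearest level


# Illustrative activations, including values outside the clipping range.
activations = [-0.3, 0.1, 0.4, 0.8, 1.7]
quantized = [quantize_activation(a) for a in activations]
print(quantized)
```

With 2 bits, every output is one of 0, 1/3, 2/3, or 1, which shows how coarse the representable range becomes and why the choice of scaling matters for compact networks.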