Compressing and Accelerating Neural Network for Facial Point Localization.

Dan Zeng, Fan Zhao, Wei Shen, Shiming Ge
DOI: https://doi.org/10.1007/s12559-017-9506-0
IF: 4.89
2017-01-01
Cognitive Computation
Abstract: State-of-the-art deep neural networks (DNNs) have greatly improved the accuracy of facial landmark localization. However, DNN models usually have a huge number of parameters, which causes high memory cost and computational complexity. To address this issue, a novel method is proposed to compress and accelerate large DNN models while maintaining performance. It includes three steps: (1) Importance-based pruning: in contrast to traditional connection pruning, weight correlations are introduced to find and prune unimportant neurons or connections. (2) Product quantization: product quantization enforces weight sharing. With a codebook of the same size, product quantization can achieve a higher compression rate than scalar quantization. (3) Network retraining: to reduce compression difficulty and performance degradation, the network is compressed one layer at a time and retrained after each layer is compressed. In addition, all pooling layers are removed and the strides of their neighboring convolutional layers are increased, which accelerates the network at the same time. Experimental results on compressing a VGG-like model demonstrate the effectiveness of the proposed method, which achieves 26× compression and 4× acceleration while the root mean squared error (RMSE) increases by just 3.6%.
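The product quantization step and its compression-rate advantage over scalar quantization can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`kmeans`, `product_quantize`, `reconstruct`, `compression_rate`), the plain Lloyd's k-means, the random test matrix, the values `sub_dim=4` and `k=16`, and the 32-bit storage accounting are all assumptions chosen for illustration. Scalar quantization is modeled here as the special case `sub_dim=1`.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative). Returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every row of x to every centroid.
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # leave empty clusters where they are
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)

def product_quantize(w, sub_dim, k):
    """Split the columns of w into blocks of width sub_dim and learn one
    k-entry codebook per block; each row of a block is stored as an index."""
    rows, cols = w.shape
    assert cols % sub_dim == 0, "cols must be divisible by sub_dim"
    n_sub = cols // sub_dim
    codebooks = np.empty((n_sub, k, sub_dim), dtype=w.dtype)
    codes = np.empty((rows, n_sub), dtype=np.int64)
    for s in range(n_sub):
        block = w[:, s * sub_dim:(s + 1) * sub_dim]
        codebooks[s], codes[:, s] = kmeans(block, k, seed=s)
    return codebooks, codes

def reconstruct(codebooks, codes):
    """Rebuild an approximate weight matrix from codebooks and indices."""
    blocks = [codebooks[s][codes[:, s]] for s in range(codebooks.shape[0])]
    return np.concatenate(blocks, axis=1)

def compression_rate(shape, sub_dim, k, bits=32):
    """Original size divided by (index bits + codebook bits).
    Assumes weights and centroids are stored as `bits`-bit floats."""
    rows, cols = shape
    n_sub = cols // sub_dim
    original = rows * cols * bits
    index_bits = rows * n_sub * int(np.ceil(np.log2(k)))
    codebook_bits = n_sub * k * sub_dim * bits
    return original / (index_bits + codebook_bits)

# Hypothetical fully connected layer: 256 x 64 random weights.
w = np.random.default_rng(0).standard_normal((256, 64))
codebooks, codes = product_quantize(w, sub_dim=4, k=16)
w_hat = reconstruct(codebooks, codes)

rate_pq = compression_rate(w.shape, sub_dim=4, k=16)      # product quantization
rate_scalar = compression_rate(w.shape, sub_dim=1, k=16)  # scalar case
print(f"PQ rate: {rate_pq:.2f}, scalar rate: {rate_scalar:.2f}")
print(f"reconstruction MSE: {np.mean((w - w_hat) ** 2):.4f}")
```

With the same codebook size `k`, each index in the product-quantized layout covers `sub_dim` weights instead of one, which is where the higher compression rate comes from. The price is a larger reconstruction error, which in the paper's pipeline is what the layer-by-layer retraining is meant to recover.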