JointCS: Joint Search for Deep Model Compression and Segmentation on Heterogeneous IoT Devices

Xinyu Li, Bin Guo, Sicong Liu, Chen Qiu, Yunji Liang, Zhiwen Yu
DOI: https://doi.org/10.1109/icpads53394.2021.00059
2021-01-01
Abstract: Deep neural networks (DNNs) play an important role in a variety of intelligent applications (e.g., image classification and target recognition), yet their heavy computation burden makes them difficult to deploy on resource-constrained IoT devices. Two categories of methods adjust model computation to address this problem: model compression and model segmentation. However, model compression reduces resource consumption mainly at the cost of accuracy, while model segmentation reduces it at the cost of communication latency. In this paper, we propose Joint Search for Model Compression and Segmentation (JointCS), which highlights the following aspects: 1) we integrate model compression and model segmentation under an automatic and progressive framework that simplifies the model to fit different IoT resource requirements, yielding a series of slim models that perform better in both accuracy and latency; 2) we train a network-architecture-aware latency predictor to quickly estimate the latency of the slimmed model on heterogeneous IoT devices; 3) we introduce a search algorithm to select the optimal state in the progressive joint search. Finally, we evaluate the proposed method on image classification with the CIFAR datasets. Compared with the state-of-the-art approach, the proposed method achieves an inference speedup of 12.2% to 30.9% at the same accuracy.
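The trade-off described in the abstract can be illustrated with a toy sketch. This is not the JointCS algorithm: it swaps the paper's progressive search for an exhaustive grid search, and the trained latency predictor for a hand-written analytical estimate. Every constant and function name below (LAYER_MFLOPS, FEATURE_KB, estimate_latency, estimate_accuracy, joint_search) is a hypothetical stand-in chosen for illustration, not taken from the paper.

```python
from itertools import product

# All numbers are made up for illustration; none come from the paper.
LAYER_MFLOPS = [40.0, 80.0, 80.0, 20.0]      # per-layer compute at full width
FEATURE_KB = [600.0, 300.0, 150.0, 40.0]     # data sent when splitting before layer k
DEVICE_MFLOPS_PER_MS = 2.0                   # assumed IoT device speed
SERVER_MFLOPS_PER_MS = 40.0                  # assumed edge server speed
BANDWIDTH_KB_PER_MS = 5.0                    # assumed link bandwidth


def estimate_latency(ratio, split):
    """Analytical stand-in for a learned latency predictor.

    ratio: fraction of channels kept (1.0 = uncompressed).
    split: layers [0, split) run on the device, the rest on the server.
    """
    n = len(LAYER_MFLOPS)
    scale = ratio * ratio  # conv cost shrinks with both input and output channels
    device_ms = sum(LAYER_MFLOPS[:split]) * scale / DEVICE_MFLOPS_PER_MS
    server_ms = sum(LAYER_MFLOPS[split:]) * scale / SERVER_MFLOPS_PER_MS
    if split == n:
        transfer_ms = 0.0  # everything runs locally, nothing is sent
    else:
        # The raw input (split == 0) is not shrunk by channel pruning.
        shrink = ratio if split > 0 else 1.0
        transfer_ms = FEATURE_KB[split] * shrink / BANDWIDTH_KB_PER_MS
    return device_ms + transfer_ms + server_ms


def estimate_accuracy(ratio):
    """Toy accuracy model: accuracy drops as more channels are pruned."""
    return 0.93 - 0.10 * (1.0 - ratio) ** 2


def joint_search(min_accuracy, ratios=(0.5, 0.75, 1.0)):
    """Grid search over (compression ratio, split point).

    Returns (latency_ms, ratio, split) with the lowest estimated latency
    among candidates meeting the accuracy floor, or None if none qualify.
    """
    best = None
    for ratio, split in product(ratios, range(len(LAYER_MFLOPS) + 1)):
        if estimate_accuracy(ratio) < min_accuracy:
            continue
        candidate = (estimate_latency(ratio, split), ratio, split)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


if __name__ == "__main__":
    print(joint_search(min_accuracy=0.91))
```

Searching both knobs together matters because the best split point can shift once the model is compressed: pruning shrinks both the on-device compute and the intermediate feature that must be transmitted.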