Task-aware Swapping for Efficient DNN Inference on DRAM-constrained Edge Systems

Cheng Ji, Zongwei Zhu, Xianmin Wang, Wenjie Zhai, Xuemei Zong, Anqi Chen, Mingliang Zhou
DOI: https://doi.org/10.1002/int.22933
IF: 8.993
2022-01-01
International Journal of Intelligent Systems
Abstract: Object detection at the edge is a common task in various environments. Deploying convolutional neural networks in intelligent edge systems is challenging because main-memory space is highly constrained. This study aims to operate neural networks with a reduced memory requirement. The basic idea is that tasks of the same type involve the same critical subnetwork. We propose identifying the critical network connections by considering the importance of channels. At runtime, the proposed method detects the task type and swaps the model parameters of the corresponding critical subnetwork from external storage into dynamic random access memory (DRAM) in a timely manner. Compared with conventional network pruning, the proposed approach further reduced the DRAM requirement by 34.6% while maintaining high inference accuracy.
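The swapping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-channel importance scores, the keep ratio, the pickle-based storage, and the names `critical_channels` and `SubnetworkSwapper` are all assumptions made for illustration, and the task type is passed in directly rather than detected.

```python
import os
import pickle
import tempfile


def critical_channels(importance, keep_ratio):
    """Return the sorted indices of the highest-scoring channels (hypothetical metric)."""
    k = max(1, int(len(importance) * keep_ratio))
    ranked = sorted(range(len(importance)), key=lambda c: importance[c], reverse=True)
    return sorted(ranked[:k])


class SubnetworkSwapper:
    """Keeps only one task's critical channel weights resident in memory at a time."""

    def __init__(self, storage_dir):
        self.storage_dir = storage_dir  # stands in for external storage
        self.resident_task = None
        self.resident_weights = None    # stands in for the DRAM-resident subnetwork
        self.swap_count = 0

    def _path(self, task):
        return os.path.join(self.storage_dir, f"{task}.pkl")

    def store(self, task, weights, importance, keep_ratio):
        """Write only the critical channels for this task type to storage."""
        keep = critical_channels(importance, keep_ratio)
        with open(self._path(task), "wb") as f:
            pickle.dump({c: weights[c] for c in keep}, f)

    def activate(self, task):
        """Swap in the task's subnetwork only if it is not already resident."""
        if task != self.resident_task:
            with open(self._path(task), "rb") as f:
                self.resident_weights = pickle.load(f)
            self.resident_task = task
            self.swap_count += 1
        return self.resident_weights


# Toy model: 8 channels with 4 weights each; importance scores are made up.
weights = [[0.1 * c] * 4 for c in range(8)]
person_importance = [0.9, 0.1, 0.8, 0.2, 0.7, 0.1, 0.1, 0.1]
vehicle_importance = [0.1, 0.9, 0.1, 0.8, 0.1, 0.7, 0.2, 0.1]

with tempfile.TemporaryDirectory() as storage:
    swapper = SubnetworkSwapper(storage)
    swapper.store("person", weights, person_importance, keep_ratio=0.5)
    swapper.store("vehicle", weights, vehicle_importance, keep_ratio=0.5)

    person_sub = swapper.activate("person")    # first use: loaded from storage
    swapper.activate("person")                 # same task type: no swap needed
    vehicle_sub = swapper.activate("vehicle")  # task type changed: swap

print(sorted(person_sub), sorted(vehicle_sub), swapper.swap_count)
```

In this toy setup only half of the channels are resident at any moment, and a swap happens only when the task type changes, which is the behavior the abstract describes at a high level.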