Multi-exit DNN inference acceleration for intelligent terminal with heterogeneous processors

Jinghui Zhang, Weilong Xin, Dingyang Lv, Jiawei Wang, Guangxing Cai, Fang Dong
DOI: https://doi.org/10.1016/j.suscom.2023.100906
2023-08-23
Sustainable Computing: Informatics and Systems
Abstract: Deep learning vision applications are increasingly deployed on terminal devices. As deep neural networks (DNNs) grow deeper and structurally more complex, their performance on computer vision tasks improves, but inference on computation-resource-constrained intelligent terminal devices frequently fails to meet latency requirements. A commonly adopted approach to inference acceleration is the multi-exit DNN, which reduces latency by providing early exits. However, existing methods do not fully exploit the heterogeneous processors (GPU/CPU) on intelligent terminal devices to cooperatively accelerate multi-exit DNN inference in parallel. Furthermore, the impact of complex image and video inputs on multi-exit DNNs, and the effects of different power consumption modes on the processors within intelligent terminal devices, remain inadequately explored. To address these issues, we comprehensively consider the computing performance of heterogeneous processors in different power consumption modes, together with the structure and characteristics of multi-exit DNNs in inference acceleration, and propose the Collaborative Inference Acceleration mechanism for intelligent terminal with Heterogeneous Processors (CIAHP). CIAHP comprises a DNN computation time prediction model and a multi-exit DNN task allocation algorithm for heterogeneous processors. Our experiments demonstrate that, when processing complex image samples, CIAHP performs multi-exit DNN inference 2.31× faster than the CPU alone and 1.23× faster than the GPU alone.
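The two ideas named in the abstract, early exits and task allocation across heterogeneous processors, can be illustrated with a minimal sketch. This is not the CIAHP algorithm, whose details the abstract does not give. It assumes a softmax-confidence exit rule and a greedy earliest-finish allocation, and all names (`early_exit_predict`, `allocate`, the `threshold` and `speed` parameters) are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Return (exit_index, predicted_class).

    Assumed exit rule (not from the paper): stop at the first exit whose
    top softmax probability reaches `threshold`; otherwise use the last exit.
    `exit_logits` holds one logit list per exit, ordered shallow to deep.
    """
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold or i == last:
            return i, probs.index(conf)


def allocate(task_costs, speed):
    """Greedy allocation of tasks to processors (illustrative assumption).

    task_costs: abstract work units per task.
    speed: dict mapping processor name to work units per millisecond,
           standing in for a computation time prediction model.
    Returns (plan, makespan), where plan maps processor -> task indices.
    """
    finish = {p: 0.0 for p in speed}
    plan = {p: [] for p in speed}
    # Place the largest tasks first, each on the processor that would
    # finish it earliest given the work already assigned.
    for idx in sorted(range(len(task_costs)), key=lambda i: -task_costs[i]):
        best = min(speed, key=lambda p: finish[p] + task_costs[idx] / speed[p])
        finish[best] += task_costs[idx] / speed[best]
        plan[best].append(idx)
    return plan, max(finish.values())
```

In this toy setting, an easy sample leaves at a shallow exit while a hard one runs to a deeper exit, and splitting four equal tasks between a CPU and a GPU that is twice as fast finishes sooner than either processor working alone.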
computer science, information systems, hardware & architecture