EdgeNN: Efficient Neural Network Inference for CPU-GPU Integrated Edge Devices.

Chenyang Zhang, Feng Zhang, Kuangyu Chen, Mingjun Chen, Bingsheng He, Xiaoyong Du
DOI: https://doi.org/10.1109/icde55515.2023.00096
2023-01-01
Abstract: With the development of architectures and the growth of AIoT application requirements, data processing on the edge has become popular. Neural network inference is widely employed for data analytics on edge devices. This paper extensively explores neural network inference on integrated edge devices and proposes EdgeNN, the first neural network inference solution on CPU-GPU integrated edge devices. EdgeNN has three novel characteristics. First, EdgeNN can adaptively utilize the unified physical memory and conduct zero-copy optimization. Second, EdgeNN involves a novel inference-targeted inter- and intra-kernel CPU-GPU hybrid execution approach, which co-runs the CPU with the GPU to fully utilize the edge device’s computing resources. Third, EdgeNN adopts a fine-grained adaptive inference tuning approach, which can divide the complicated inference structure into sub-tasks mapped to the CPU and the GPU. Experiments show that on six popular neural network inference tasks, EdgeNN brings average speedups of 3.97×, 3.12×, and 8.80× over inference on the CPU of the integrated device, inference on a mobile phone CPU, and inference on an edge CPU device, respectively. Additionally, it achieves a 22.02% time benefit over direct execution of the original programs. Specifically, 9.93% comes from better utilization of unified memory, and 10.76% comes from CPU-GPU hybrid execution. Besides, EdgeNN can deliver 29.14× and 5.70× higher energy efficiency than the edge CPU and the discrete GPU, respectively. We have made EdgeNN available at https://github.com/ChenyangZhang-cs/EdgeNN.
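The intra-kernel CPU-GPU hybrid execution mentioned in the abstract can be illustrated with a minimal sketch. This is not EdgeNN's implementation: the function names, the `gpu_ratio` parameter, and the use of two Python threads as stand-ins for the CPU and the GPU are all assumptions made for illustration. The idea shown is only that a single kernel's output (here, the rows of a matrix-vector product) is split by a ratio, the two parts run concurrently, and the results are concatenated.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec_rows(weights, x, start, stop):
    """Compute rows [start, stop) of the product weights @ x."""
    return [sum(w * v for w, v in zip(weights[r], x)) for r in range(start, stop)]


def hybrid_matvec(weights, x, gpu_ratio=0.75):
    """Split the output rows between two workers and run them concurrently.

    Both workers are plain Python threads; they only stand in for the GPU
    and the CPU. gpu_ratio is a hypothetical tuning knob giving the fraction
    of rows assigned to the "GPU" worker.
    """
    n_rows = len(weights)
    split = int(n_rows * gpu_ratio)  # rows [0, split) -> "GPU", the rest -> "CPU"
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_part = pool.submit(matvec_rows, weights, x, 0, split)
        cpu_part = pool.submit(matvec_rows, weights, x, split, n_rows)
        # Concatenate in row order so the result matches a single-device run.
        return gpu_part.result() + cpu_part.result()


weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
result = hybrid_matvec(weights, x)
```

In this sketch the split ratio is fixed by the caller; an adaptive tuner of the kind the abstract describes would presumably choose it per kernel, but how EdgeNN does so is not stated here.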