Cooperative Inference Analysis Based on DNN Convolutional Kernel Partitioning

ZHI Jialin, TENG Yinglei, ZHANG Xinyang, NIU Tao, SONG Mei
DOI: https://doi.org/10.11959/j.issn.2096-3750.2022.00308
2022-01-01
Abstract: With the growing use of intelligent chips in edge terminal devices, a large number of AI applications will be deployed at the network edge, closer to data sources. DNN partitioning methods make it possible to train and deploy deep learning models on resource-constrained terminal devices and relieve the computing bottleneck of edge AI. The kernel-based partition method (KPM) is proposed as a new scheme that builds on the traditional workload-based partition method (WPM). Inference performance is analyzed quantitatively in terms of computation (FLOPs), memory consumption, and communication cost, and the two schemes are compared qualitatively in terms of the flexibility, robustness, and privacy of the inference process. Finally, a software and hardware experimental platform is built, and AlexNet and VGG11 are implemented in PyTorch to further verify the advantages of the proposed scheme in delay and energy consumption. The results show that, compared with WPM, KPM accelerates DNN inference more effectively in large-scale computing scenarios, with lower memory usage and energy consumption.
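The abstract does not spell out how KPM divides a layer, so the following is only a minimal sketch of one plausible reading: each device receives the full input feature map and a subset of the layer's convolutional kernels (output channels), and the per-device outputs are concatenated along the channel axis. The function names (`conv2d`, `split_kernels`, `kernel_partitioned_conv`, `conv_macs`) and the contiguous splitting rule are assumptions made for illustration, not the authors' implementation. Plain Python lists are used instead of PyTorch so the example runs without dependencies.

```python
# Illustrative sketch only: all names and the splitting rule are assumptions,
# not taken from the paper.

def conv2d(x, kernels):
    """Naive convolution (cross-correlation), stride 1, no padding.

    x:       input as [C_in][H][W] nested lists
    kernels: weights as [C_out][C_in][K][K] nested lists
    returns: output as [C_out][H-K+1][W-K+1]
    """
    if not kernels:  # a device that received no kernels produces no channels
        return []
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0][0])
    out = []
    for kern in kernels:
        fmap = [[sum(kern[c][i][j] * x[c][r + i][col + j]
                     for c in range(c_in)
                     for i in range(k)
                     for j in range(k))
                 for col in range(w - k + 1)]
                for r in range(h - k + 1)]
        out.append(fmap)
    return out


def split_kernels(kernels, n_devices):
    """Assign contiguous groups of output kernels to devices (assumed rule)."""
    base, extra = divmod(len(kernels), n_devices)
    parts, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)
        parts.append(kernels[start:start + size])
        start += size
    return parts


def kernel_partitioned_conv(x, kernels, n_devices):
    """Each device convolves the full input with its own kernel subset;
    the per-device feature maps are concatenated along the channel axis."""
    out = []
    for part in split_kernels(kernels, n_devices):
        out.extend(conv2d(x, part))  # would run on a separate device in practice
    return out


def conv_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate count of one convolutional layer."""
    return c_in * c_out * k * k * h_out * w_out


# Small deterministic example: 2 input channels (4x4), 3 kernels of size 2x2.
x = [[[r * 4 + c + ch for c in range(4)] for r in range(4)] for ch in range(2)]
kernels = [[[[o + c + i - j for j in range(2)] for i in range(2)]
            for c in range(2)] for o in range(3)]

full = conv2d(x, kernels)
partitioned = kernel_partitioned_conv(x, kernels, n_devices=2)
print(partitioned == full)  # partitioning by kernel leaves the result unchanged
```

Under this reading, the per-device compute scales with the number of kernels assigned to it, while every device still needs the whole input feature map; whether that matches the communication and memory model analyzed in the paper cannot be confirmed from the abstract alone.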