Optimized co-scheduling of mixed-precision neural network accelerator for real-time multitasking applications

Wei Jiang, Ziwei Song, Jinyu Zhan, Zhiyuan He, Xiangyu Wen, Ke Jiang
DOI: https://doi.org/10.1016/j.sysarc.2020.101775
IF: 5.836
2020-11-01
Journal of Systems Architecture
Abstract:<p>Neural networks are increasingly applied in real-time and embedded Artificial Intelligence (AI) systems such as autonomous driving systems. Such resource-constrained systems cannot support the execution of neural-network-based tasks because of their high execution overheads on general-purpose processors. Hence, we design real-time AI applications on embedded systems with hybrid CPU and FPGA (Field Programmable Gate Array) coprocessors. We use a dedicated FPGA to accelerate the neural network jobs and the CPU to process the remaining jobs of the real-time multitasking applications. We devise an Idle-Aware Earliest Deadline First policy to co-schedule the AI applications on the hybrid CPU and FPGA coprocessors. Since implementing a neural network job on the FPGA accelerator with different precision configurations results in different execution times and accuracies, we also address the design optimization of real-time AI applications running on a mixed-precision neural network accelerator, with the goal of maximizing the accuracy-related rewards of all applications subject to real-time constraints. We formulate the problem as a multi-stage decision procedure and propose an efficient dynamic programming approach with two pruning policies that reduce the number of intermediate search states. Extensive experiments and real-life case evaluations demonstrate the efficiency of the proposed approaches.</p>
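The multi-stage decision idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not describe its two pruning policies or its schedulability test, so the sketch below assumes a much simpler model in which each task offers several precision options, each with a hypothetical (execution time, reward) pair, and all chosen options must fit within a single time budget. The function name `select_precisions`, the budget model, and both pruning rules (discarding infeasible states and discarding dominated states) are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical model: each precision option is (execution_time, reward).
Option = Tuple[int, float]


def select_precisions(tasks: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Pick one option per task to maximize total reward within a time budget.

    Each task is one decision stage. A state is
    (total_time, total_reward, chosen option indices so far).
    """
    states = [(0, 0.0, [])]
    for options in tasks:
        candidates = []
        for time, reward, picks in states:
            for idx, (t, r) in enumerate(options):
                new_time = time + t
                # Assumed pruning rule 1: drop states that exceed the budget.
                if new_time <= budget:
                    candidates.append((new_time, reward + r, picks + [idx]))
        # Assumed pruning rule 2: drop dominated states, i.e. states that
        # take at least as long as another state without a higher reward.
        candidates.sort(key=lambda s: (s[0], -s[1]))
        states = []
        best_reward = float("-inf")
        for state in candidates:
            if state[1] > best_reward:
                states.append(state)
                best_reward = state[1]
    if not states:
        raise ValueError("no feasible assignment within the budget")
    best = max(states, key=lambda s: s[1])
    return best[1], best[2]
```

Calling `select_precisions` with one option list per task and an integer budget returns the best total reward and the index of the option chosen for each task. Keeping only non-dominated states bounds the number of states per stage by the number of distinct total times, which is the usual motivation for pruning in this kind of dynamic program.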
computer science, software engineering, hardware & architecture
What problem does this paper attempt to address?