Aries: A DNN Inference Scheduling Framework for Multi-core Accelerators

Yunyi Xiang, Zheng Wu, Haidong Yao, Xiankui Xiong, Fan Yang
DOI: https://doi.org/10.1145/3670105.3670136
2024-01-01
Abstract: To deploy ever-larger Deep Neural Networks (DNNs) effectively, deep learning accelerators have evolved toward multi-core architectures. Deploying these models on multi-core neural processing units (NPUs) requires intricate steps such as segmentation, mapping, scheduling, and instruction compilation. Optimizing the entire deployment process is a sophisticated challenge because the scheduling search space is large. To address this challenge, we propose Aries, a DNN scheduling framework for multi-core accelerators. By adopting tensor parallelism and a genetic scheduling algorithm based on an accurate behavior-level architecture model, Aries significantly accelerates exploration of the scheduling space while maintaining optimization efficiency.