A tree-recursive partitioned multicast mechanism for NoC-based deep neural network accelerator
Yiming Ouyang, Yihe Zhang, Huaguo Liang, Jianhua Li
DOI: https://doi.org/10.1016/j.mejo.2024.106161
IF: 1.992
2024-03-17
Microelectronics Journal
Abstract: In chip multiprocessor systems (CMPs), Network on Chip (NoC) has been widely adopted for its reusability, high reliability, and low power consumption. Recently, using NoC platforms to accelerate deep neural networks (DNNs) has become a new trend. This design allows the intermediate computation results of DNNs to be transmitted within the chip, reducing the number of accesses to off-chip memory. However, the large amount of one-to-many traffic in a DNN accelerator occupies system bandwidth, which significantly degrades the performance of NoC platforms dominated by one-to-one traffic. To address this issue, we propose a tree-based recursive partitioning multicast scheme (TRPM), which increases path diversity and improves system bandwidth. We also design a single-cycle per-hop router architecture, which effectively enhances the transmission efficiency of multicast packets. Detailed simulation results show that, compared with the latest tree-based multicast algorithm for DNN accelerators, our scheme reduces on average the number of routed packets by 35%, the classification latency by 13.5%, and the average packet latency by 14.5%.
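Illustrative note (not from the paper): the bandwidth cost of one-to-many traffic can be seen in a minimal sketch that counts link traversals on a 2D mesh. It compares sending a separate unicast packet to every destination with a generic tree multicast that shares common path segments. This is not the TRPM algorithm, whose partitioning rules the abstract does not describe. Dimension-ordered XY routing, the coordinate scheme, and all function names are assumptions made for illustration.

```python
# Hypothetical sketch: link traversals for repeated unicast vs. a generic
# tree multicast on a 2D mesh. XY routing is an assumption, not the paper's
# routing function, and this is NOT the TRPM scheme itself.

def xy_path(src, dst):
    """Return the directed links of a dimension-ordered (X, then Y) route."""
    x, y = src
    links = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def unicast_link_traversals(src, dsts):
    """One packet per destination: every hop of every path is paid for."""
    return sum(len(xy_path(src, d)) for d in dsts)


def tree_link_traversals(src, dsts):
    """Tree multicast: a link shared by several XY paths is traversed once."""
    shared = set()
    for d in dsts:
        shared.update(xy_path(src, d))
    return len(shared)


source = (0, 0)
destinations = [(3, 0), (3, 1), (3, 2), (3, 3)]  # one column of a 4x4 mesh
print(unicast_link_traversals(source, destinations))  # 18
print(tree_link_traversals(source, destinations))     # 6
```

In this toy case the shared tree uses 6 link traversals where repeated unicast uses 18. The figures describe only this example and are unrelated to the results reported in the abstract.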
engineering, electrical & electronic; nanoscience & nanotechnology