Generalized Fault-Tolerance Topology Generation for Application-Specific Network-on-Chips
Song Chen, Mengke Ge, Zhigang Li, Jinglei Huang, Qi Xu, Feng Wu
DOI: https://doi.org/10.1109/tcad.2019.2952134
IF: 2.9
2020-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: The network-on-chip (NoC)-based communication architecture is a promising candidate for addressing communication bottlenecks in many-core processors and neural network processors. In this article, we consider the generalized fault-tolerance topology generation problem, in which link (physical channel) or switch failures can occur, for application-specific NoCs (ASNoCs). Given a user-defined maximum number of faults, K, we propose an integer linear programming (ILP)-based method to generate ASNoC topologies that can tolerate up to K faults in switches or links. Given the communication requirements between cores and their floorplan, we first propose a convex-cost-flow-based method to solve the core mapping (CM) problem of building connections between the cores and switches. Second, an ILP-based method is proposed to solve the routing path allocation (PA) problem, in which K + 1 switch-disjoint routing paths are allocated for every communication flow between the cores. Finally, to reduce switch sizes, we propose sharing switch ports among the connections between the cores and switches, and we formulate the port sharing problem as a clique-partitioning problem, which is solved by iteratively finding maximum cliques. Additionally, we propose an ILP-based method to solve the CM and routing PA problems simultaneously when only physical link failures are considered. The experimental results show that the power consumption of fault-tolerance topologies increases almost linearly with K because of the routing path redundancy needed for fault tolerance. When both switch faults and link faults are considered, port sharing reduces the average power consumption of fault-tolerance topologies with K = 1, K = 2, and K = 3 by 18.08%, 28.88%, and 34.20%, respectively.
When considering only the physical link faults, the experimental results show that compared to the fault-tolerant topology generation (FTTG) algorithm, the proposed method reduces power consumption and hop count by 10.58% and 6.25%, respectively; compared to the de Bruijn Digraph (DBG)-based method, the proposed method reduces power consumption and hop count by 21.72% and 9.35%, respectively.
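As an illustration of the port sharing step, the following Python sketch partitions a compatibility graph into cliques by repeatedly extracting a maximum clique, which is the general strategy the abstract names. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the example connections `c0` to `c4`, and the reading that an edge means "these two core-switch connections may share a port" are all hypothetical, since the abstract does not state the compatibility criterion. The maximum clique search is an exact Bron-Kerbosch search with pivoting, which is exponential in the worst case and only suitable for small graphs.

```python
def build_adjacency(nodes, edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def max_clique(adj, nodes):
    """Return one maximum clique of the subgraph induced by `nodes`."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # this branch cannot beat the current best
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(nodes), set())
    return best


def clique_partition(adj):
    """Partition all nodes into cliques by repeatedly removing a maximum clique."""
    remaining = set(adj)
    groups = []
    while remaining:
        clique = max_clique(adj, remaining)
        groups.append(sorted(clique))
        remaining -= set(clique)
    return groups


# Hypothetical compatibility graph: an edge means two connections
# are assumed to be able to share one switch port.
connections = ["c0", "c1", "c2", "c3", "c4"]
compatible = [("c0", "c1"), ("c0", "c2"), ("c1", "c2"), ("c2", "c3"), ("c3", "c4")]
groups = clique_partition(build_adjacency(connections, compatible))
print(groups)
```

In this example each resulting group would correspond to one shared port, so five connections would need two ports instead of five. Greedy extraction of maximum cliques is a heuristic for clique partitioning and does not guarantee the minimum number of groups in general.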