Tianjic: A Unified and Scalable Chip Bridging Spike-Based and Continuous Neural Computation
Lei Deng, Guanrui Wang, Guoqi Li, Shuangchen Li, Ling Liang, Maohua Zhu, Yujie Wu, Zheyu Yang, Zhe Zou, Jing Pei, Zhenzhi Wu, Xing Hu, Yufei Ding, Wei He, Yuan Xie, Luping Shi
DOI: https://doi.org/10.1109/jssc.2020.2970709
IF: 5.4
2020-01-01
IEEE Journal of Solid-State Circuits
Abstract: Toward the long-standing dream of artificial intelligence, two successful solution paths have been paved: 1) neuromorphic computing and 2) deep learning. Recently, the two have begun to interact, aiming to achieve biological plausibility and high accuracy simultaneously. However, models from these two domains have to run on distinct substrates, i.e., neuromorphic platforms and deep learning accelerators, respectively. This architectural incompatibility greatly compromises modeling flexibility and hinders promising interdisciplinary research. To address this issue, we build a unified model description framework and a unified processing architecture (Tianjic), which covers the full stack from software to hardware. By implementing a set of integration and transformation operations, Tianjic is able to support spiking neural networks, biological dynamic neural networks, multilayer perceptrons, convolutional neural networks, recurrent neural networks, and so on. A compatible routing infrastructure enables homogeneous and heterogeneous scalability on a decentralized many-core network. Several optimization methods are incorporated, such as resource and data sharing, near-memory processing, compute/access skipping, and intra-/inter-core pipelining, to improve performance and efficiency. We further design streaming mapping schemes for efficient network deployment with a flexible tradeoff between execution throughput and resource overhead. A 28-nm prototype chip is fabricated with >610-GB/s internal memory bandwidth. A variety of benchmarks are evaluated and compared with GPUs and several existing specialized platforms. In summary, the fully unfolded mapping can achieve significantly higher throughput and power efficiency; the semi-folded mapping can save 30x resources while still presenting comparable performance on average. Finally, two hybrid-paradigm examples, a multimodal unmanned bicycle and a hybrid neural network, are demonstrated to show the potential of our unified architecture. This article paves a new way to explore neural computing.
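
The abstract describes covering both spiking and continuous models with a shared set of integration and transformation operations. The sketch below illustrates that idea only in the abstract: one shared integration step (a weighted sum) followed by a paradigm-specific transformation. The function names, the ReLU activation, and the integrate-and-fire rule with a hard reset are illustrative assumptions, not the chip's actual operations or dataflow.

```python
from typing import List, Tuple


def integrate(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """Shared integration step: one weighted sum per output neuron.

    The inputs may be binary spikes (0/1) or continuous activations.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def transform_ann(z: List[float]) -> List[float]:
    """Continuous (ANN-style) transformation. ReLU is an assumed example."""
    return [max(value, 0.0) for value in z]


def transform_snn(
    z: List[float], potential: List[float], threshold: float = 1.0
) -> Tuple[List[int], List[float]]:
    """Spiking (SNN-style) transformation.

    Assumed rule: add the integrated input to the membrane potential,
    emit a spike when the threshold is reached, then reset to zero.
    """
    spikes: List[int] = []
    new_potential: List[float] = []
    for z_i, v_i in zip(z, potential):
        v_i = v_i + z_i
        if v_i >= threshold:
            spikes.append(1)
            new_potential.append(0.0)  # hard reset after firing
        else:
            spikes.append(0)
            new_potential.append(v_i)  # keep accumulating
    return spikes, new_potential


if __name__ == "__main__":
    weights = [[0.5, 0.5], [1.0, -1.0]]
    inputs = [1.0, 1.0]
    z = integrate(weights, inputs)
    print(transform_ann(z))
    print(transform_snn(z, [0.0, 0.0]))
```

In this sketch both paradigms reuse `integrate` unchanged and differ only in the transformation applied afterwards, which is one way to read the "unified" claim. A hybrid network could, under the same assumptions, feed the output of one transformation into the integration step of a layer that uses the other.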