Algorithm and hardware codesign of sparse binary network on-chip

Xichuan Zhou, Haijun Liu, Cong Shi, Ji Liu
DOI: https://doi.org/10.1016/b978-0-32-385783-3.00016-8
2022-01-01
Abstract: Unlike the previous chapter, which focused on hardware accelerator design, this chapter concentrates on algorithm and hardware codesign. Deep neural networks are the state-of-the-art models for understanding the content of images and videos. However, implementing deep neural networks in embedded systems is challenging: a typical deep neural network can consume gigabytes of memory, creating bandwidth and computational bottlenecks. To address this challenge, we present an algorithm and hardware codesign for efficient deep neural computation. We propose a hardware-oriented deep learning algorithm, named the Deep Adaptive Network, to exploit the sparsity of neural connections. By adaptively removing the majority of neural connections and robustly representing the retained connections using binary integers, the proposed algorithm can reduce memory usage and computational resources by up to 99.9% without undermining classification accuracy. An efficient sparse-mapping-memory-based hardware architecture is proposed to take full advantage of the algorithmic optimization. Unlike the traditional von Neumann architecture, the Deep-Adaptive-Network-on-Chip (DANoC) brings communication and computation into close proximity to avoid power-hungry parameter transfers between on-board memory and on-chip computational units. Experiments on different image classification benchmarks show that the DANoC system achieves competitive accuracy and efficiency in comparison with state-of-the-art approaches.
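The abstract does not spell out how the Deep Adaptive Network selects or encodes connections, so the following is only a minimal sketch of the general idea of a sparse binary layer. It assumes plain magnitude pruning followed by sign binarization as a stand-in for the chapter's adaptive method; the function names `prune_and_binarize`, `sparse_binary_dot`, and `storage_bits` and the `keep_fraction` parameter are hypothetical and do not come from the chapter.

```python
def prune_and_binarize(weights, keep_fraction):
    """Keep the largest-magnitude weights and store only their signs.

    Assumption: magnitude pruning stands in for the chapter's adaptive
    pruning rule, which the abstract does not describe.
    Returns a list of (index, sign) pairs with sign in {-1, +1}.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank connection indices by |w| and keep the top n_keep of them.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = sorted(ranked[:n_keep])
    return [(i, 1 if weights[i] >= 0 else -1) for i in kept]


def sparse_binary_dot(sparse_weights, inputs):
    """Dot product over the retained connections only.

    With +/-1 weights, each multiply reduces to an add or a subtract.
    """
    total = 0.0
    for index, sign in sparse_weights:
        total += inputs[index] if sign > 0 else -inputs[index]
    return total


def storage_bits(n_kept, n_total):
    """Bits to store n_kept connections as (index, 1-bit sign) pairs.

    Assumption: a simple index-plus-sign layout, not the chapter's
    sparse-mapping memory format.
    """
    index_bits = max(1, (n_total - 1).bit_length())
    return n_kept * (index_bits + 1)


if __name__ == "__main__":
    dense = [0.9, -0.05, 0.02, -0.8, 0.01, 0.3, -0.02, 0.04]
    sparse = prune_and_binarize(dense, keep_fraction=0.25)
    print("retained connections:", sparse)
    print("output:", sparse_binary_dot(sparse, [1, 2, 3, 4, 5, 6, 7, 8]))
    print("sparse bits:", storage_bits(len(sparse), len(dense)),
          "vs dense 32-bit bits:", 32 * len(dense))
```

In this toy layout the saving comes from two sources: most connections are not stored at all, and each retained connection costs one sign bit plus an index instead of a full-precision value. The figures it prints are illustrative only and say nothing about the 99.9% reduction reported in the abstract.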