A Scalable 3D Array Architecture for Accelerating Convolutional Neural Networks

Yafei Ji, Xiang Wang, Yangfan Zhou, Cheng Chen, Li Jiang, Haoyuan Wang, Xuguang Wang, Xin Liu
DOI: https://doi.org/10.1007/978-981-16-9247-5_7
2022-01-01
Abstract: Convolutional neural networks (CNNs) are widely used in computer vision and image recognition, and their structure is becoming increasingly complex. This complexity poses performance and storage-capacity challenges for hardware implementation. To address these challenges, we propose a novel 3D array architecture for accelerating CNNs. The proposed architecture has several benefits. First, a multilevel cache strategy improves data reuse, thereby reducing the frequency of external memory accesses. Second, performance and throughput are balanced among the 3D array nodes through novel workload and weight partitioning schemes. Third, computation and transmission are performed simultaneously, resulting in higher parallelism and lower hardware storage requirements. Finally, an efficient data mapping strategy is proposed to improve the scalability of the entire system. Experimental results show that the proposed 3D array architecture can effectively improve the overall computing performance of the system.