A-U3D: A Unified 2D/3D CNN Accelerator on the Versal Platform for Disparity Estimation.
Tianyu Zhang, Dong Li, Hong Wang, Yunzhi Li, Xiang Ma, Wei Luo, Yu Wang, Yang Huang, Yi Li, Yu Zhang, Xinlin Yang, Xijie Jia, Qiang Lin, Lu Tian, Fan Jiang, Dongliang Xie, Hong Luo, Yi Shan
DOI: https://doi.org/10.1109/fpl57034.2022.00029
2022-01-01
Abstract: 3-dimensional (3D) convolutional neural networks (CNNs) are widely used for disparity estimation. However, 3D CNNs are more computationally intensive than 2D CNNs because of the added disparity dimension. To enable more practical applications in autonomous driving, robotics, and other scenarios on embedded devices, we propose a unified 2D/3D CNN accelerator (A-U3D) design. This design maps both 3D standard convolution and 3D transposed convolution onto 2D standard convolution, so our processing unit supports 2D and 3D convolution in the same mode without additional structures. Based on PSMNet, a 3D CNN for disparity estimation, we build a heterogeneous multi-core system that integrates A-U3D with the CPU, DSP, and AI Engines on the Xilinx Versal ACAP platform. Running the pruned 8-bit model, our A-U3D system achieves a latency of 0.289 s, which is 11.5× faster than the state-of-the-art solution on the same platform, and reaches an end-to-end (E2E) performance of 10.1 frames per second (FPS). Our proposed system explores the feasibility of deploying 3D CNNs with large workloads on FPGAs.
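The idea of expressing a 3D standard convolution through 2D standard convolutions can be illustrated with a small NumPy sketch. This is only one generic decomposition (summing 2D convolutions over the kernel's depth taps), not necessarily the mapping A-U3D uses in hardware. The function names, tensor layouts, and the choice of stride 1 with no padding are assumptions made for illustration, and the transposed case is not covered.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2D cross-correlation, stride 1.
    x: (C, H, W), w: (K, C, kh, kw) -> (K, H-kh+1, W-kw+1)."""
    K, _, kh, kw = w.shape
    Ho, Wo = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((K, Ho, Wo))
    for i in range(kh):
        for j in range(kw):
            patch = x[:, i:i + Ho, j:j + Wo]  # (C, Ho, Wo)
            out += np.einsum("kc,chw->khw", w[:, :, i, j], patch)
    return out

def conv3d_direct(x, w):
    """Reference 3D cross-correlation, stride 1, no padding.
    x: (C, D, H, W), w: (K, C, kd, kh, kw)."""
    K, _, kd, kh, kw = w.shape
    Do = x.shape[1] - kd + 1
    Ho = x.shape[2] - kh + 1
    Wo = x.shape[3] - kw + 1
    out = np.zeros((K, Do, Ho, Wo))
    for d in range(Do):
        for h in range(Ho):
            for c in range(Wo):
                window = x[:, d:d + kd, h:h + kh, c:c + kw]
                # Contract over (C, kd, kh, kw), leaving one value per output channel.
                out[:, d, h, c] = np.tensordot(
                    w, window, axes=([1, 2, 3, 4], [0, 1, 2, 3]))
    return out

def conv3d_via_2d(x, w):
    """Same result, built only from 2D convolutions: each output depth
    slice is the sum of kd 2D convolutions, one per kernel depth tap."""
    K, _, kd, kh, kw = w.shape
    Do = x.shape[1] - kd + 1
    out = np.zeros((K, Do, x.shape[2] - kh + 1, x.shape[3] - kw + 1))
    for d in range(Do):
        for t in range(kd):
            out[:, d] += conv2d(x[:, d + t], w[:, :, t])
    return out

# Small random check that the two formulations agree.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 6, 7))      # (C, D, H, W)
w = rng.standard_normal((3, 2, 3, 3, 3))   # (K, C, kd, kh, kw)
assert np.allclose(conv3d_direct(x, w), conv3d_via_2d(x, w))
```

In this formulation the disparity (depth) dimension only adds an outer accumulation loop around an unchanged 2D kernel, which is the kind of property that lets a single compute unit serve both 2D and 3D layers.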