A 28-nm Energy-Efficient Sparse Neural Network Processor for Point Cloud Applications Using Block-Wise Online Neighbor Searching

Xiaoyu Feng, Wenyu Sun, Chen Tang, Xinyuan Lin, Jinshan Yue, Huazhong Yang, Yongpan Liu
DOI: https://doi.org/10.1109/jssc.2024.3386878
IF: 5.4
2024-01-01
IEEE Journal of Solid-State Circuits
Abstract: Voxel-based point cloud networks composed of multiple kinds of sparse convolutions (SCONVs) play an essential role in emerging applications such as autonomous driving and visual navigation. Many sparse processors have been proposed for image applications. However, they cannot properly handle three problems specific to point clouds: the low efficiency of random memory access; non-parallel neighbor search together with the area overhead of supporting hybrid operators; and unbalanced workload among multiple cores. In this work, a 2-D/3-D unified SCONV accelerator is proposed with three key features: a block-wise sparse data storage format supporting out-of-order memory allocation and continuous memory access; a high-throughput and reconfigurable SCONV core providing unified support for multiple kinds of sparse CNNs; and an asynchronous and synchronous hybrid scheduler for multiple cores with a dynamic on-chip memory router to maximize data reuse and core utilization. The chip is fabricated in 28-nm CMOS technology and achieves 4.68-TOPS/W peak energy efficiency, 2$\times$ higher than the previous accelerator. It is also the first accelerator to provide unified 2-D/3-D support and end-to-end inference ability for voxel-based point cloud networks.
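To make the idea of block-wise neighbor searching concrete, the following is a minimal software sketch, not the hardware scheme described in the paper. It assumes active voxels are grouped into fixed-size cubic blocks, so that the neighbor lookup for a 3×3×3 sparse convolution only touches the block containing the candidate coordinate. The block size `BLOCK` and the helper names `build_blocks` and `find_neighbors` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from itertools import product

BLOCK = 4  # hypothetical block edge length in voxels (an assumption)


def build_blocks(coords):
    """Group active voxel coordinates by the block that contains them.

    Returns a mapping: block key -> {voxel coordinate -> voxel index}.
    """
    blocks = defaultdict(dict)
    for idx, (x, y, z) in enumerate(coords):
        key = (x // BLOCK, y // BLOCK, z // BLOCK)
        blocks[key][(x, y, z)] = idx
    return blocks


def find_neighbors(coords, blocks, kernel=3):
    """Return (voxel_index, kernel_offset, neighbor_index) triples.

    Only active voxels produce entries, which is what makes the
    convolution sparse: inactive positions are skipped entirely.
    """
    r = kernel // 2
    offsets = list(product(range(-r, r + 1), repeat=3))
    pairs = []
    for idx, (x, y, z) in enumerate(coords):
        for dx, dy, dz in offsets:
            n = (x + dx, y + dy, z + dz)
            key = (n[0] // BLOCK, n[1] // BLOCK, n[2] // BLOCK)
            block = blocks.get(key)  # .get avoids creating empty blocks
            if block is not None and n in block:
                pairs.append((idx, (dx, dy, dz), block[n]))
    return pairs


if __name__ == "__main__":
    coords = [(0, 0, 0), (1, 0, 0), (4, 0, 0), (10, 10, 10)]
    blocks = build_blocks(coords)
    for pair in find_neighbors(coords, blocks):
        print(pair)
```

Because each block is looked up independently, the search for different blocks could in principle run in parallel, which is the intuition behind a block-wise layout; how the chip actually organizes and schedules this is not specified in the abstract.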