A Low-Latency Framework with Algorithm-Hardware Co-Optimization for 3-D Point Cloud
Yue Yu, Wendong Mao, Jiapeng Luo, Zhongfeng Wang
DOI: https://doi.org/10.1109/tcsii.2023.3283142
2023-01-01
IEEE Transactions on Circuits & Systems II Express Briefs
Abstract: As an important type of 3D representation, the point cloud is widely used in many applications, such as autonomous driving, AR/VR, and intelligent robots, which require real-time interaction with humans. However, the sparsity of 3D point cloud data leads to severe computational inefficiency when it is processed by 2D data processors, posing a huge challenge for hardware acceleration. In this brief, we aim to solve the inefficiency problem through algorithm-hardware co-optimization. Firstly, a lightweight network, named LPN, is proposed for point cloud data classification; it is $30\times$ smaller than PointNet and still achieves comparable accuracy. Secondly, a reconfigurable computing core, named RCC, together with an adaptive dataflow, is developed to support the different layers of the LPN. Specifically, to accelerate memory-intensive layers, a partially-parallel computing scheme is introduced to minimize the on-chip memory requirements and DRAM accesses. Finally, based on the above innovations, a low-latency accelerator is proposed to realize real-time computation for the point cloud, and it is implemented on the Xilinx Kintex UltraScale KCU150 FPGA board. Experimental results show that it achieves a $1.5\times$ throughput improvement compared with state-of-the-art works and a $35\times$ speedup over an Intel Xeon Gold 6148 CPU, demonstrating the superiority of the proposed method. The code of LPN is available from https://github.com/snowsil/LPN-model-for-3D-classifification .
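The memory argument behind processing points in small groups can be illustrated with a minimal Python sketch. It is not the RCC design or the LPN architecture, neither of which is specified in the abstract. It assumes a PointNet-style pipeline (a shared per-point layer followed by a global max-pool), and all names (`pointwise_layer`, `global_max_full`, `global_max_tiled`, `tile`) are hypothetical. The point it shows is that a running maximum over tiles of points gives the same result as buffering every per-point feature vector, while storing only one feature vector at a time.

```python
import random

def pointwise_layer(point, weights):
    """Shared per-point linear layer followed by ReLU (same weights for every point)."""
    return [max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights]

def global_max_full(points, weights):
    """Baseline: buffer all N feature vectors (N x C values), then max-pool over points."""
    features = [pointwise_layer(p, weights) for p in points]
    return [max(channel) for channel in zip(*features)]

def global_max_tiled(points, weights, tile=8):
    """Tiled variant (assumed scheme): visit `tile` points at a time and keep
    only a running maximum of C values instead of the full N x C buffer."""
    # ReLU outputs are non-negative, so 0.0 is a safe starting value.
    running = [0.0] * len(weights)
    for start in range(0, len(points), tile):
        for p in points[start:start + tile]:
            feat = pointwise_layer(p, weights)
            running = [max(r, f) for r, f in zip(running, feat)]
    return running

if __name__ == "__main__":
    random.seed(0)
    pts = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
    w = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(16)]
    print(global_max_full(pts, w) == global_max_tiled(pts, w))
```

In this sketch the tile size only changes how many points are visited per outer iteration; the result is independent of it, which is what allows a hardware design to trade parallelism against buffer size.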