A 28nm 1.2GHz 5.27TOPS/W Scalable Vision/Point Cloud Deep Fusion Processor with CAM-based Universal Mapping Unit for BEVFusion Applications.

Xiaoyu Feng, Wenyu Sun, Xinyuan Lin, Shupei Fan, Huazhong Yang, Yongpan Liu
DOI: https://doi.org/10.1109/CICC60959.2024.10529062
2024-01-01
Abstract: Multi-sensor perception has become crucial for emerging applications such as autonomous driving and robot navigation. Deep learning methods, particularly those employing Bird's Eye View (BEV) fusion [1], now set the benchmarks in environmental perception. However, they pose substantial challenges for traditional hardware, as depicted in Fig. 1. On the one hand, incorporating diverse sensors introduces more irregular operators, including image-to-BEV mapping by Lift Splat Shoot (LSS) [2], 3D point-cloud-to-BEV mapping by z-axis flattening, and sparse CNNs for processing point clouds. These operators generate heavy memory-mapping overhead on traditional CPU-based solutions. Each of these irregular computations consists of two steps: irregular address mapping and regular multiply-accumulate (MAC) computing. On the other hand, the increasing cost of high-performance chips makes it hard for a single-chip solution to keep up with the exponentially growing computational requirements of the models. Chip-level scalability is therefore essential to improve performance.
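The two-step structure described in the abstract (irregular address mapping followed by regular MAC computing) can be illustrated with a minimal software sketch of a z-axis flatten. This is not the processor's CAM-based mapping unit or the authors' implementation. The function names, the dictionary-based address map, and the per-height weights `z_weights` are all hypothetical, chosen only to make the two steps visible.

```python
from collections import defaultdict

def build_bev_address_map(coords):
    """Step 1 (irregular): group voxel indices by their BEV cell (x, y).

    Hypothetical helper: a dictionary stands in for the address lookup.
    """
    address_map = defaultdict(list)
    for idx, (x, y, _z) in enumerate(coords):
        address_map[(x, y)].append(idx)
    return dict(address_map)

def flatten_to_bev(coords, feats, z_weights):
    """Step 2 (regular): multiply-accumulate over voxels sharing a BEV cell.

    Assumption: each voxel feature is scaled by a weight selected by its
    z index and summed into the cell; a real model may flatten differently.
    """
    address_map = build_bev_address_map(coords)
    channels = len(feats[0])
    bev = {}
    for cell, indices in address_map.items():
        acc = [0.0] * channels
        for idx in indices:
            w = z_weights[coords[idx][2]]
            for c in range(channels):
                acc[c] += w * feats[idx][c]  # the MAC operation
        bev[cell] = acc
    return bev

# Three sparse voxels with (x, y, z) coordinates and 2-channel features.
coords = [(0, 0, 0), (0, 0, 1), (2, 1, 0)]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
z_weights = [1.0, 0.5]
bev = flatten_to_bev(coords, feats, z_weights)
```

In this sketch the first two voxels share the BEV cell (0, 0), so their weighted features are accumulated into one output, while the third voxel maps to its own cell. The data-dependent grouping in step 1 is the irregular part; the inner loop in step 2 is a dense, regular MAC.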