F-Transformer: Point Cloud Fusion Transformer for Cooperative 3D Object Detection

Jie Wang, Guiyang Luo, Quan Yuan, Jinglin Li
DOI: https://doi.org/10.1007/978-3-031-15919-0_15
2022-01-01
Abstract: We present a novel cooperative detection framework that fuses multi-view point clouds to accurately detect hard samples (e.g., partly or fully occluded objects, or small objects). A two-step communication scheme transmits pillar features between views, making it possible to observe the same object from different viewpoints. We then design a Transformer-based feature fusion scheme that fuses the pillar features obtained by discretizing the point clouds. To account for the sparsity of the information, we improve the Transformer's self-attention mechanism with Re-Scaled Dot-Product Attention, which allows valuable information to be captured from the sparse features more effectively. We evaluate our method on synthetic cooperative datasets generated over multiple complex traffic scenarios. The results show that our method surpasses all other cooperative perception methods by significant margins.
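To make the fusion step concrete, the following is a minimal sketch of the standard scaled dot-product attention that the Re-Scaled variant builds on, applied to pillar features from two views. The abstract does not specify the re-scaling itself, so it is not reproduced here; the function name, the array sizes, and the choice to use ego-view pillars as queries over the pillars of both views are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Standard attention: softmax(q k^T / sqrt(d)) v.

    This is the baseline mechanism, not the paper's Re-Scaled variant.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)

# Hypothetical pillar features: one row per non-empty pillar, 8 channels each.
ego = rng.normal(size=(4, 8))     # pillars observed by the ego view
remote = rng.normal(size=(6, 8))  # pillars received from another view

# Assumed fusion layout: ego pillars attend over the pillars of both views.
tokens = np.concatenate([ego, remote], axis=0)
fused, weights = scaled_dot_product_attention(ego, tokens, tokens)

print(fused.shape)    # one fused feature vector per ego pillar
print(weights.shape)  # attention of each ego pillar over all pillars
```

Each row of `weights` sums to 1, so every fused ego pillar is a convex combination of the pillar features from both views.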