STFormer3D: Spatio-Temporal Transformer Based 3D Object Detection for Intelligent Driving.

Wei Liu, Yue Zhang, Haoxiang Jie, Jun Hu
DOI: https://doi.org/10.1145/3603781.3603857
2023-01-01
Abstract: This paper proposes a novel method for efficiently detecting 3D objects in point clouds. Our method uses both Convolutional Neural Networks (CNNs) and Transformer networks, combining their strengths in feature extraction and in capturing long-range contextual information. To improve detection performance under occlusion, we propose a temporal fusion module that fuses the features of the current frame with those of the previous frame. We also use a bidirectional feature pyramid network (BiFPN) to aggregate features across scales.
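The abstract does not say how the features are combined, so the following is only a minimal sketch of the "fast normalized fusion" rule commonly associated with BiFPN, not the authors' implementation. It assumes the feature maps have already been resized to a common shape and flattened to plain lists; the function name `fast_normalized_fusion` and the parameter `eps` are illustrative choices.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Illustrative sketch only: `features` is a list of flat feature vectors
    (assumed already resized to a common shape), and `weights` holds one
    scalar per vector, which would be learnable in a real network.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    # ReLU keeps every weight non-negative.
    w = [max(0.0, x) for x in weights]
    # eps avoids division by zero when all weights are clipped to 0.
    total = sum(w) + eps
    # Weighted sum at each position, divided by the weight total.
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / total
        for i in range(len(features[0]))
    ]


# Two hypothetical feature vectors from different scales, equally weighted.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean; a negative weight is clipped to zero, so that input is ignored.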