PointTrans: Rethinking 3D Object Detection from a Translation Perspective with Transformer

Jingyang Liu, Yucheng Xu, Wanbiao Lin, Lei Sun
DOI: https://doi.org/10.23919/ccc58697.2023.10240559
2023-01-01
Abstract: 3D object detection provides useful information about surrounding objects in complex environments, which serves as the foundation for autonomous robots and self-driving vehicles to work safely outdoors and as the premise of high-level applications such as robot path planning and obstacle avoidance. In this paper, we propose PointTrans, a two-stage 3D object detection framework. Stage 1 performs bottom-up object-entity recognition and localization via semantic segmentation and clustering; stage 2 treats the bounding-box regression task as a translation task via a Transformer architecture. PointTrans not only detects objects in 3D scenes with state-of-the-art performance, but also generates a dense semantic representation of the 3D scene. Moreover, we embed the parameters of a 3D bounding box into a “word” vector that can be iteratively learned and optimized across Transformer decoder layers.
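The two-stage flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the box parameterization, or the decoder design, so the radius-based region growing, the 7-value box vector (center, size, yaw), and the hypothetical `residual_fn` standing in for a Transformer decoder layer are all assumptions made for illustration.

```python
import numpy as np

def cluster_foreground(points, is_foreground, radius=0.5):
    """Stage-1 stand-in: group foreground points by region growing.

    `is_foreground` is assumed to come from a semantic segmentation
    network; background points keep the label -1.
    """
    idx = np.flatnonzero(is_foreground)
    labels = np.full(len(points), -1, dtype=int)
    next_label = 0
    for seed in idx:
        if labels[seed] != -1:
            continue  # already absorbed into an earlier cluster
        labels[seed] = next_label
        stack = [seed]
        while stack:
            cur = stack.pop()
            dist = np.linalg.norm(points[idx] - points[cur], axis=1)
            neighbors = idx[(dist <= radius) & (labels[idx] == -1)]
            labels[neighbors] = next_label
            stack.extend(neighbors.tolist())
        next_label += 1
    return labels

def initial_box(cluster_points):
    """Axis-aligned box vector (cx, cy, cz, l, w, h, yaw) for one cluster.

    The 7-value layout is a hypothetical parameterization; yaw starts at 0.
    """
    lo, hi = cluster_points.min(axis=0), cluster_points.max(axis=0)
    return np.concatenate([(lo + hi) / 2, hi - lo, [0.0]])

def refine(box, residual_fn, num_layers=3):
    """Stage-2 stand-in: add one residual update to the box vector per layer.

    `residual_fn` is a placeholder for a learned decoder layer.
    """
    for _ in range(num_layers):
        box = box + residual_fn(box)
    return box
```

In this sketch, each cluster found by `cluster_foreground` yields one box vector from `initial_box`, and `refine` updates that vector layer by layer, which mirrors the idea of a box "word" that is optimized across decoder layers.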