AnchorPoint: Query Design for Transformer-Based 3D Object Detection and Tracking

Hao Liu, Yanni Ma, Hanyun Wang, Chaobo Zhang, Yulan Guo
DOI: https://doi.org/10.1109/tits.2023.3282204
IF: 8.5
2023-01-01
IEEE Transactions on Intelligent Transportation Systems
Abstract: With the success of Transformers in natural language processing, object detection with Transformers (DETR) has attracted widespread attention. In previous Transformer-based 2D detectors, the object queries are a set of learned embeddings. However, it is very hard to apply these detectors to the 3D domain because learned object queries lack explicit physical meaning and position priors. In this paper, we introduce the concept of anchors and propose a novel query design based on anchor points. In our query design, we use foreground points as the anchor points and encode them as the object queries. Consequently, each object query has an explicit physical meaning and focuses only on its nearby object. Additionally, we propose an instance-aware sampling strategy to select a small set of representative foreground points from the scene point cloud. Extensive experiments on several large-scale 3D object detection datasets demonstrate that the proposed AnchorPoint detector achieves promising accuracy and efficiency. In particular, AnchorPoint achieves an average precision (AP) of 83.21 at 61 frames per second (FPS) on the moderate level of the KITTI-DET Car subset. Moreover, we model each object as its corresponding anchor point and extend the AnchorPoint model to 3D multi-object tracking by adding an extra tracking head. We show that our method achieves performance comparable to existing state-of-the-art methods on the KITTI-MOT dataset.
engineering, electrical & electronic,transportation science & technology, civil
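
Illustrative sketch (not from the paper): the abstract describes choosing foreground points as anchor points and encoding them as object queries, but it gives neither the sampling rule nor the encoding. The snippet below is only a toy illustration under stated assumptions. It replaces the instance-aware sampling with a plain top-k over hypothetical per-point foreground scores, and it uses a generic sinusoidal encoding of the (x, y, z) coordinates. The names `select_anchor_points`, `encode_query`, and `num_freqs` are invented for this example.

```python
import math

def select_anchor_points(points, fg_scores, k):
    """Keep the k points with the highest foreground score.

    Assumption: plain top-k stands in for the paper's instance-aware
    sampling, whose details are not given in the abstract.
    """
    order = sorted(range(len(points)), key=lambda i: fg_scores[i], reverse=True)
    return [points[i] for i in order[:k]]

def encode_query(anchor, num_freqs=4):
    """Encode one (x, y, z) anchor as a sinusoidal vector of length 3 * 2 * num_freqs.

    Assumption: a generic sin/cos positional encoding; the paper's actual
    query encoder may differ.
    """
    emb = []
    for coord in anchor:
        for f in range(num_freqs):
            angle = coord * (2.0 ** f)
            emb.extend((math.sin(angle), math.cos(angle)))
    return emb

# Toy scene: four points with made-up foreground scores.
points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (4.0, -1.0, 0.2), (9.0, 3.0, 1.0)]
fg_scores = [0.1, 0.9, 0.7, 0.2]

anchors = select_anchor_points(points, fg_scores, k=2)
queries = [encode_query(a) for a in anchors]  # one query vector per anchor point
```

Because each query is computed from a concrete 3D location, it carries an explicit position prior, which is the property the abstract contrasts with freely learned query embeddings.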