SDAPNet: End-to-End Multi-task Simultaneous Detection and Prediction Network.

Shanding Ye, Han Yao, Wenfu Wang, Yongjian Fu, Zhijie Pan
DOI: https://doi.org/10.1109/ijcnn52387.2021.9533290
2021-01-01
Abstract: Accurate object detection in the perception system is a basic requirement for autonomous vehicles to operate reliably. Current mainstream paradigms for perception and prediction decompose the problem into three sequential subtasks: detection, tracking, and prediction. Most previous work focuses on one of these subtasks, so the relationships among them are ignored. In this work, we propose an end-to-end model that jointly performs 3D detection and motion prediction for autonomous vehicles and can benefit from joint optimization of the two tasks. To limit computation and memory cost, our approach applies 2D convolutions over a bird's-eye-view (BEV) representation of the 3D world rather than 3D convolutions, which require more inference time for a single point cloud. The key component is a BEV representation whose channels are built from sequential multi-frame point clouds, which helps the network perform object detection and learn temporal information at the same time. Experiments on the nuScenes dataset show that the proposed network achieves state-of-the-art results while balancing accuracy and efficiency on both tasks. We also validate the effectiveness of our model through ablation studies. Specifically, by sharing the network, parameters, and detection results, we can perform both tasks at 32 FPS.
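The abstract does not specify how the multi-frame BEV channels are built, so the following is only a minimal sketch of one common way to do it: each frame is voxelized into a few binary height slices, and the slices of all frames are stacked along the channel axis so that ordinary 2D convolutions can see both space and time. The function name `frames_to_bev`, the grid extents, the cell size, the number of height bins, and the use of binary occupancy are all assumptions for illustration, not details taken from the paper. The sketch also assumes the frames have already been transformed into a common ego coordinate frame.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, common ego frame (assumed)


def frames_to_bev(
    frames: Sequence[Sequence[Point]],
    x_range: Tuple[float, float] = (-40.0, 40.0),  # hypothetical extent
    y_range: Tuple[float, float] = (-40.0, 40.0),  # hypothetical extent
    z_range: Tuple[float, float] = (-3.0, 1.0),    # hypothetical extent
    cell: float = 0.5,                             # hypothetical cell size
    z_bins: int = 4,                               # hypothetical height slices
) -> List[List[List[int]]]:
    """Return a binary occupancy grid shaped [T * z_bins][H][W].

    Channel index t * z_bins + k holds height slice k of frame t, so
    time and height are both folded into the channel axis.
    """
    width = int(round((x_range[1] - x_range[0]) / cell))
    height = int(round((y_range[1] - y_range[0]) / cell))
    z_step = (z_range[1] - z_range[0]) / z_bins

    channels = [
        [[0] * width for _ in range(height)]
        for _ in range(len(frames) * z_bins)
    ]

    for t, points in enumerate(frames):
        for x, y, z in points:
            inside = (
                x_range[0] <= x < x_range[1]
                and y_range[0] <= y < y_range[1]
                and z_range[0] <= z < z_range[1]
            )
            if not inside:
                continue  # drop points outside the region of interest
            # Clamp to guard against floating-point rounding at the upper edge.
            col = min(int((x - x_range[0]) / cell), width - 1)
            row = min(int((y - y_range[0]) / cell), height - 1)
            k = min(int((z - z_range[0]) / z_step), z_bins - 1)
            channels[t * z_bins + k][row][col] = 1

    return channels
```

With two input frames and the default settings, the result has 8 channels of 160 x 160 cells. A 2D convolutional backbone could consume such a grid directly, which is the kind of saving over 3D convolutions that the abstract refers to.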