PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird's-Eye View

Peizheng Li, Shuxiao Ding, Xieyuanli Chen, Niklas Hanselmann, Marius Cordts, Juergen Gall
2023-06-19
Abstract: Accurately perceiving instances and predicting their future motion are key tasks for autonomous vehicles, enabling them to navigate safely in complex urban traffic. While bird's-eye view (BEV) representations are commonplace in perception for autonomous driving, their potential in a motion prediction setting is less explored. Existing approaches for BEV instance prediction from surround cameras rely on a multi-task auto-regressive setup coupled with complex post-processing to predict future instances in a spatio-temporally consistent manner. In this paper, we depart from this paradigm and propose an efficient novel end-to-end framework named POWERBEV, which differs in several design choices aimed at reducing the inherent redundancy in previous methods. First, rather than predicting the future in an auto-regressive fashion, POWERBEV uses a parallel, multi-scale module built from lightweight 2D convolutional networks. Second, we show that segmentation and centripetal backward flow are sufficient for prediction, simplifying previous multi-task objectives by eliminating redundant output modalities. Building on this output representation, we propose a simple, flow warping-based post-processing approach which produces more stable instance associations across time. Through this lightweight yet powerful design, POWERBEV outperforms state-of-the-art baselines on the NuScenes Dataset and poses an alternative paradigm for BEV instance prediction. We made our code publicly available at: https://github.com/EdwardLeeLPZ/PowerBEV.
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
The paper addresses how an autonomous driving system can accurately perceive surrounding vehicles and predict their future motion in bird's-eye view. It proposes a framework called PowerBEV that simplifies the multi-task setup of existing methods while improving prediction performance. The main issues it tackles are:

1. **Simplifying Multi-Task Settings**: Existing methods rely on several redundant output representations (segmentation maps, instance centers, forward flows, and so on) to predict future instances. These outputs require complex loss functions and complicated post-processing. PowerBEV predicts only segmentation maps and centripetal backward flow.
2. **Improving Prediction Accuracy**: PowerBEV replaces auto-regressive prediction with a parallel, multi-scale module built from lightweight 2D convolutional networks, which reduces redundancy. The framework outperforms existing baseline methods on the NuScenes dataset.
3. **Optimizing Label Generation**: The authors found that existing methods (such as FIERY) have systematic errors in data preprocessing and label generation that degrade prediction performance. They propose an improved label generation method to address these errors.
4. **Lightweight Design**: Compared to existing complex models, PowerBEV adopts a lightweight design, making the framework more efficient and easier to deploy.

In summary, the paper aims to improve instance perception and future motion prediction for autonomous driving by simplifying the prediction pipeline and correcting the label generation, without sacrificing accuracy.
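
To make the idea of flow warping-based instance association more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that the backward flow at frame t stores, for each foreground pixel, a (row, column) offset pointing to where its instance was at frame t-1, and that instance IDs for the first frame are already known. The function name `propagate_instance_ids`, the array layouts, and the nearest-pixel lookup are all illustrative assumptions.

```python
import numpy as np


def propagate_instance_ids(first_ids, segmentation, backward_flow):
    """Carry instance IDs forward in time by following backward flow.

    Hypothetical layouts (assumptions, not taken from the paper's code):
      first_ids:     (H, W) int array, 0 means background.
      segmentation:  (T, H, W) bool array, True for foreground pixels.
      backward_flow: (T, H, W, 2) float array of (d_row, d_col) offsets
                     from a pixel at frame t to its instance at frame t-1.
                     The entry for t = 0 is ignored.
    Returns a (T, H, W) int array of instance IDs.
    """
    num_frames, height, width = segmentation.shape
    ids = np.zeros((num_frames, height, width), dtype=first_ids.dtype)
    ids[0] = np.where(segmentation[0], first_ids, 0)

    # Pixel coordinate grids used to turn offsets into absolute positions.
    rows, cols = np.mgrid[0:height, 0:width]

    for t in range(1, num_frames):
        # Position in frame t-1 that each pixel of frame t points to,
        # rounded to the nearest pixel and clipped to the grid.
        src_rows = np.rint(rows + backward_flow[t, ..., 0]).astype(int)
        src_cols = np.rint(cols + backward_flow[t, ..., 1]).astype(int)
        src_rows = np.clip(src_rows, 0, height - 1)
        src_cols = np.clip(src_cols, 0, width - 1)

        # Each foreground pixel inherits the ID found at that position.
        warped = ids[t - 1][src_rows, src_cols]
        ids[t] = np.where(segmentation[t], warped, 0)

    return ids
```

In this sketch, a pixel whose flow points at background in the previous frame receives ID 0; a fuller pipeline would need a rule for such pixels and for newly appearing instances, which is outside what the abstract specifies.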