PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird's-Eye View

Peizheng Li, Shuxiao Ding, Xieyuanli Chen, Niklas Hanselmann, Marius Cordts, Juergen Gall

2023-06-19

Abstract: Accurately perceiving instances and predicting their future motion are key tasks for autonomous vehicles, enabling them to navigate safely in complex urban traffic. While bird's-eye view (BEV) representations are commonplace in perception for autonomous driving, their potential in a motion prediction setting is less explored. Existing approaches for BEV instance prediction from surround cameras rely on a multi-task auto-regressive setup coupled with complex post-processing to predict future instances in a spatio-temporally consistent manner. In this paper, we depart from this paradigm and propose an efficient novel end-to-end framework named POWERBEV, which differs in several design choices aimed at reducing the inherent redundancy in previous methods. First, rather than predicting the future in an auto-regressive fashion, POWERBEV uses a parallel, multi-scale module built from lightweight 2D convolutional networks. Second, we show that segmentation and centripetal backward flow are sufficient for prediction, simplifying previous multi-task objectives by eliminating redundant output modalities. Building on this output representation, we propose a simple, flow warping-based post-processing approach which produces more stable instance associations across time. Through this lightweight yet powerful design, POWERBEV outperforms state-of-the-art baselines on the NuScenes Dataset and poses an alternative paradigm for BEV instance prediction. We made our code publicly available at: https://github.com/EdwardLeeLPZ/PowerBEV.

Computer Vision and Pattern Recognition, Robotics

What problem does this paper attempt to address?

The paper addresses the problem of perceiving surrounding vehicles and predicting their future motion in autonomous driving systems. It proposes a new framework, PowerBEV, that simplifies the multi-task setup of existing methods while improving prediction performance. The main issues the paper attempts to solve are as follows:

1. **Simplifying Multi-Task Settings**: Existing methods rely on multiple redundant output representations (such as segmentation maps, instance centers, and forward flows) when predicting future instances. These representations require complex loss functions as well as complicated post-processing steps. PowerBEV simplifies this by using only segmentation maps and centripetal backward flows.
2. **Improving Prediction Accuracy**: By using a parallel multi-scale module built from 2D convolutional networks, PowerBEV improves prediction accuracy while reducing redundancy. The framework outperforms existing baseline methods on the NuScenes dataset.
3. **Optimizing Label Generation**: The authors found that existing methods (such as FIERY) have systematic errors in data preprocessing and label generation, which degrade prediction performance. They propose an improved label generation method to address these issues and enhance prediction results.
4. **Lightweight Design**: Compared to existing complex models, PowerBEV adopts a lightweight design, making the entire framework more efficient and easier to deploy.

In summary, the core objective of the paper is to improve the perception of surrounding vehicles and the prediction of their future motion by simplifying the prediction process and optimizing the label generation method, while maintaining high accuracy.
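The idea of associating instances across time by warping with a backward flow can be illustrated with a small sketch. This is not the authors' implementation: the function name `warp_instance_ids`, the grid layout, and the use of integer per-cell offsets are all assumptions made for illustration. Each foreground cell in the current frame follows its backward-flow offset into the previous frame and copies the instance ID it finds there.

```python
from typing import List, Tuple

Grid = List[List[int]]
Flow = List[List[Tuple[int, int]]]  # per-cell (d_row, d_col) offset, hypothetical layout


def warp_instance_ids(prev_ids: Grid, seg_mask: Grid, backward_flow: Flow) -> Grid:
    """Propagate instance IDs from the previous frame to the current one.

    Each foreground cell in the current frame follows its backward-flow
    offset to a cell in the previous frame and copies the ID found there.
    Cells that land out of bounds or on background get 0 (unassigned).
    """
    height, width = len(seg_mask), len(seg_mask[0])
    curr_ids = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not seg_mask[r][c]:
                continue  # background cell, nothing to associate
            dr, dc = backward_flow[r][c]
            pr, pc = r + dr, c + dc
            if 0 <= pr < height and 0 <= pc < width:
                curr_ids[r][c] = prev_ids[pr][pc]
    return curr_ids


# Toy example: a two-cell vehicle with ID 7 moves one cell to the right.
prev_ids = [
    [0, 0, 0, 0],
    [7, 7, 0, 0],
    [0, 0, 0, 0],
]
seg_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Every cell points one column to the left, i.e. back to where it was.
backward_flow = [[(0, -1)] * 4 for _ in range(3)]

curr_ids = warp_instance_ids(prev_ids, seg_mask, backward_flow)
print(curr_ids)
```

In this toy case both foreground cells inherit ID 7 from the previous frame, so the vehicle keeps its identity after moving. A real system would work on dense tensors with sub-pixel flow and would need a rule for cells that land on background, such as starting a new instance; those details are left out here.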

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

PreBEV: Leveraging Predictive Flow for Enhanced Bird's-Eye View 3D Dynamic Object Detection

Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

PointBeV: A Sparse Approach to BeV Predictions

Spatiotemporal BEV Pyramid Networks for Future Instance Prediction of Autonomous Driving

Vehicle Trajectory Prediction Method Driven by Raw Sensing Data for Intelligent Vehicles

BEVSeg2TP: Surround View Camera Bird's-Eye-View Based Joint Vehicle Segmentation and Ego Vehicle Trajectory Prediction

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

BEVScope: Enhancing Self-Supervised Depth Estimation Leveraging Bird's-Eye-View in Dynamic Scenarios

TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving

MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

BEVGPT: Generative Pre-trained Large Model for Autonomous Driving Prediction, Decision-Making, and Planning

Depth-Assisted Camera-Based Bird's Eye View Perception for Autonomous Driving

Surrounding-aware representation prediction in Birds-Eye-View using transformers

FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras