QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation

Yuxin Li, Yiheng Li, Xulei Yang, Mengying Yu, Zihang Huang, Xiaojun Wu, Chai Kiat Yeo
2024-10-09
Abstract: Bird's-Eye-View (BEV) perception has become a vital component of autonomous driving systems due to its ability to integrate multiple sensor inputs into a unified representation, enhancing performance in various downstream tasks. However, the computational demands of BEV models pose challenges for real-world deployment in vehicles with limited resources. To address these limitations, we propose QuadBEV, an efficient multitask perception framework that leverages the shared spatial and contextual information across four key tasks: 3D object detection, lane detection, map segmentation, and occupancy prediction. QuadBEV not only streamlines the integration of these tasks using a shared backbone and task-specific heads but also addresses common multitask learning challenges such as learning rate sensitivity and conflicting task objectives. Our framework reduces redundant computations, thereby enhancing system efficiency, making it particularly suited for embedded systems. We present comprehensive experiments that validate the effectiveness and robustness of QuadBEV, demonstrating its suitability for real-world applications.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the computational cost of multi-task perception frameworks in autonomous driving systems. It proposes an efficient four-task perception framework named **QuadBEV**, which uses a Bird's-Eye-View (BEV) representation to integrate four key tasks: 3D object detection, lane detection, map segmentation, and occupancy prediction. The paper targets two issues:

1. **Computational Resource Constraints**: Traditional BEV methods are computationally intensive, which makes them difficult to deploy on vehicles with limited computational resources.
2. **Challenges of Multi-task Learning**:
   - **Learning Rate Sensitivity**: Different tasks respond differently to the same learning rate, so the optimal learning rate for one task may degrade the performance of another.
   - **Task Objective Conflicts**: Each task may need to emphasize different aspects of the shared features, which leads to conflicts during training.

### Solution

To address these challenges, the paper proposes the **QuadBEV** framework, whose main features are:

1. **Multi-task Architecture**: Integrates the four key tasks into a unified framework in which a shared backbone network feeds task-specific heads.
2. **Progressive Training Strategy**:
   - **Feature Extractor Pre-training**: Pre-train the feature extractor using the map segmentation task.
   - **Multi-task Warm-up Training**: Freeze the parameters of the feature extraction layers and progressively train all task-specific heads, balancing tasks by adjusting learning rates and loss weights.
   - **End-to-end Training**: Eliminate the distinction between primary and auxiliary tasks, and use a gradient-weighted algorithm to adjust loss weights dynamically so that the tasks stay balanced.
3. **Experimental Validation**: Validate the effectiveness and robustness of **QuadBEV** through extensive experiments, demonstrating its potential for real-world autonomous driving scenarios.
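
The three-stage schedule and the loss balancing described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the summary does not specify the gradient-weighted algorithm, so the inverse-gradient-norm rule used here is a stand-in, and the names `trainable_groups`, `gradient_balanced_weights`, `total_loss`, the stage labels, and the parameter-group names are all hypothetical. Including the map segmentation head in the pre-training stage is also an assumption.

```python
from typing import Dict, List

# The four tasks named in the summary; the identifiers are hypothetical.
TASKS = ["detection", "lane", "map_seg", "occupancy"]


def trainable_groups(stage: str) -> List[str]:
    """Return the parameter groups that receive updates in a given stage."""
    heads = [f"{task}_head" for task in TASKS]
    if stage == "pretrain":
        # Stage 1: the feature extractor is trained via map segmentation.
        # Training the map segmentation head alongside it is an assumption.
        return ["backbone", "map_seg_head"]
    if stage == "warmup":
        # Stage 2: the backbone is frozen; only the task heads are trained.
        return heads
    if stage == "end_to_end":
        # Stage 3: backbone and all heads are trained jointly.
        return ["backbone"] + heads
    raise ValueError(f"unknown stage: {stage}")


def gradient_balanced_weights(
    grad_norms: Dict[str, float], eps: float = 1e-8
) -> Dict[str, float]:
    """Assumed balancing rule: weight each task inversely to the norm of its
    gradient on the shared layers, then rescale so the weights sum to the
    number of tasks. Tasks with large gradients are damped."""
    inverse = {task: 1.0 / (norm + eps) for task, norm in grad_norms.items()}
    scale = len(inverse) / sum(inverse.values())
    return {task: scale * value for task, value in inverse.items()}


def total_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-task losses."""
    return sum(weights[task] * losses[task] for task in losses)
```

In a real training loop the gradient norms would be recomputed periodically during end-to-end training, so the weights track how strongly each task currently pulls on the shared backbone.
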
### Main Contributions

1. **Multi-task Architecture**: Proposes a single framework that handles four key autonomous driving tasks.
2. **Progressive Training Strategy**: Designs a phased learning rate adjustment and a gradient-based loss balancing technique to achieve balanced learning across tasks.
3. **Experimental Validation**: Validates the effectiveness and robustness of **QuadBEV** through extensive experiments, demonstrating its potential in practical applications.

### Conclusion

Compared to traditional methods, the **QuadBEV** framework improves the computational efficiency and processing speed of multi-task perception while maintaining performance comparable to existing state-of-the-art methods. This makes it particularly suitable for real-time processing in embedded systems.