Abstract: Bird's-eye-view (BEV) semantic maps have become a crucial element of urban intelligent traffic management and monitoring, offering valuable visual and data representations for informed decision making in intelligent cities. Nevertheless, current methods still underutilize the temporal information embedded in dynamic frames during the BEV feature transformation. This limitation reduces accuracy when mapping high-speed moving objects, particularly in capturing their shape and dynamic trajectory. To address this challenge, a framework for cross-view semantic segmentation is proposed, using simulated environments as a starting point before applying it to real-life urban intelligent transportation scenarios. The view converter module is designed to collate information from multiple initial-view observations captured from various angles and modes. It outputs a top-down-view semantic map characterized by its object spatial layout, preserving useful temporal information in the BEV transformation. A novel application is also devised that uses transformer networks to map images and video sequences into top-down, or bird's-eye, views. The approach combines physics-based and constraint-based formulations, and ablation studies substantiate it, highlighting the significance of context above and below a given point in generating these maps. The method is validated on the nuScenes dataset, where it yields state-of-the-art instantaneous mapping results, with particular benefits for smaller dynamic categories. The experimental findings include a comparison of axial attention with the state-of-the-art (SOTA) model, demonstrating the performance enhancement associated with temporal awareness.
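The abstract does not specify the network, so the following is only a minimal sketch of what attention restricted to the vertical image axis can look like, which is one way to use the "context above and below a given point". It is a generic single-head axial attention in pure Python, not the authors' implementation. The function name `column_axial_attention` is hypothetical, and as a simplifying assumption the queries, keys, and values are the raw features with no learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def column_axial_attention(feat):
    """Axial attention along the vertical (height) axis.

    feat[h][w] is a list of C floats. Each pixel attends only to the
    pixels in its own column, i.e. the context above and below it.
    Assumption: query = key = value = input feature (no learned weights).
    """
    height, width = len(feat), len(feat[0])
    channels = len(feat[0][0])
    scale = 1.0 / math.sqrt(channels)
    out = [[None] * width for _ in range(height)]
    for w in range(width):
        # Gather one image column: the only context this axis can see.
        column = [feat[h][w] for h in range(height)]
        for h in range(height):
            query = column[h]
            # Scaled dot-product scores against every row of the column.
            scores = [scale * sum(a * b for a, b in zip(query, key))
                      for key in column]
            weights = softmax(scores)
            # Weighted sum of the column's values.
            out[h][w] = [sum(wt * value[c] for wt, value in zip(weights, column))
                         for c in range(channels)]
    return out
```

Because each column is processed independently, the cost per column grows with the square of the image height rather than with the square of the full pixel count, which is the usual motivation for axial attention over full 2D self-attention.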
