Abstract: Recent advances in high-definition (HD) map construction from surround-view images have highlighted their cost-effectiveness in deployment. However, prevailing techniques often fall short in accurately extracting and utilizing road features, as well as in the implementation of view transformation. In response, we introduce HeightMapNet, a novel framework that establishes a dynamic relationship between image features and road surface height distributions. By integrating height priors, our approach refines the accuracy of Bird's-Eye-View (BEV) features beyond conventional methods. HeightMapNet also introduces a foreground-background separation network that sharply distinguishes between critical road elements and extraneous background components, enabling precise focus on detailed road micro-features. Additionally, our method leverages multi-scale features within the BEV space, optimally utilizing spatial geometric information to boost model performance. HeightMapNet has shown exceptional results on the challenging nuScenes and Argoverse 2 datasets, outperforming several widely recognized approaches. The code will be available at https://github.com/adasfag/HeightMapNet/.

What problem does this paper attempt to address?

The paper targets four challenges in high-definition (HD) map construction:

1. **Accuracy of road feature extraction and utilization**: Existing methods do not accurately extract and use road features from surround-view images, which degrades the quality and reliability of the resulting HD maps.
2. **Implementation of view transformation**: Existing methods are often imprecise when transforming views, especially for environmental details such as the height distribution of the road surface, which limits the model's understanding of the environment.
3. **Separation of background and foreground**: Most existing studies do not filter out non-critical elements (such as the sky and other irrelevant background features) from multi-view input image features, so irrelevant data interferes with the model and reduces the accuracy and reliability of its perception output.
4. **Utilization of multi-scale features**: Current research tends to use single-layer image features for computational efficiency and ignores the benefits of multi-scale feature fusion in the BEV space, which limits the model's effectiveness in complex road environments.

To address these problems, the paper proposes the HeightMapNet framework, which improves HD map construction through the following innovations:

- **Advanced view transformation module**: Dynamically links image features with the height distribution of the road surface, enhancing spatial understanding; it can be integrated into attention-based neural networks.
- **Foreground-background separation network**: Uses self-supervised learning to optimize feature extraction in the road environment, removing irrelevant background elements to improve the clarity and quality of input features and the reliability of perception results.
- **Multi-scale feature fusion mechanism**: Fuses multi-scale features in the BEV space, improving the accuracy and robustness of map construction, especially in complex road environments.

Together, these innovations allow HeightMapNet to outperform several widely recognized methods on the challenging nuScenes and Argoverse 2 datasets.
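
To make the idea of linking image features to a road-height distribution more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes a single pinhole camera (x right, y down, z forward) mounted `cam_height` metres above a nominal ground plane, a single-channel feature map, and nearest-neighbour sampling. The function names, the candidate heights, and the logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over height bins."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (x right, y down, z forward)."""
    x, y, z = point_cam
    if z <= 0:
        return None  # the point lies behind the camera
    return (fx * x / z + cx, fy * y / z + cy)


def sample_nearest(feature_map, u, v):
    """Nearest-neighbour lookup; returns None outside the feature map."""
    col, row = int(round(u)), int(round(v))
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None


def height_weighted_feature(feature_map, lateral, forward, cam_height,
                            heights, height_logits, intrinsics):
    """Aggregate image features for one BEV cell over candidate road heights.

    Each candidate height h places the cell at camera-frame
    (lateral, cam_height - h, forward). The sampled features are averaged
    with the softmax of the (assumed, hypothetical) height logits.
    """
    fx, fy, cx, cy = intrinsics
    probs = softmax(height_logits)
    weighted_sum, weight_total = 0.0, 0.0
    for h, p in zip(heights, probs):
        pixel = project((lateral, cam_height - h, forward), fx, fy, cx, cy)
        if pixel is None:
            continue
        value = sample_nearest(feature_map, *pixel)
        if value is None:
            continue  # candidate projects outside the image
        weighted_sum += p * value
        weight_total += p
    if weight_total == 0.0:
        return None  # no candidate height is visible in this camera
    return weighted_sum / weight_total  # renormalise over visible candidates


# Toy 5x5 feature map whose value encodes (row, col) as row * 10 + col.
feature_map = [[r * 10 + c for c in range(5)] for r in range(5)]
intrinsics = (2.0, 2.0, 2.0, 2.0)  # fx, fy, cx, cy (made-up values)

uniform = height_weighted_feature(
    feature_map, lateral=0.0, forward=2.0, cam_height=1.0,
    heights=[0.0, 1.0], height_logits=[0.0, 0.0], intrinsics=intrinsics)
peaked = height_weighted_feature(
    feature_map, lateral=0.0, forward=2.0, cam_height=1.0,
    heights=[0.0, 1.0], height_logits=[10.0, -10.0], intrinsics=intrinsics)
print(uniform, peaked)
```

With uniform logits the two candidate heights contribute equally. With a sharply peaked distribution the result is dominated by the pixel that corresponds to the most likely height, which is the intuition behind using a height prior to decide where a BEV cell should look in the image.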

HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning

LODM: Large-scale Online Dense Mapping for UAV

HDMapNet: An Online HD Map Construction and Evaluation Framework

ScalableMap: Scalable Map Learning for Online Long-Range Vectorized HD Map Construction

HoMap: End-to-End Vectorized HD Map Construction with High-order Modeling

VectorMapNet: End-to-end Vectorized HD Map Learning

PriorMapNet: Enhancing Online Vectorized HD Map Construction with Priors

EAN-MapNet: Efficient Vectorized HD Map Construction with Anchor Neighborhoods

HDMapNet: A Local Semantic Map Learning and Evaluation Framework

StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction

Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors

HeightLane: BEV Heightmap guided 3D Lane Detection

Complementing Onboard Sensors with Satellite Maps: A New Perspective for HD Map Construction

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction

TICMapNet: A Tightly Coupled Temporal Fusion Pipeline for Vectorized HD Map Learning

Online Vectorized HD Map Construction using Geometry
