AVM-SLAM: Semantic Visual SLAM with Multi-Sensor Fusion in a Bird's Eye View for Automated Valet Parking

Ye Li, Wenchao Yang, Dekun Lin, Qianlei Wang, Zhe Cui, Xiaolin Qin
2024-07-01
Abstract: Accurate localization in challenging garage environments -- marked by poor lighting, sparse textures, repetitive structures, dynamic scenes, and the absence of GPS -- is crucial for automated valet parking (AVP) tasks. Addressing these challenges, our research introduces AVM-SLAM, a cutting-edge semantic visual SLAM architecture with multi-sensor fusion in a bird's eye view (BEV). This novel framework synergizes the capabilities of four fisheye cameras, wheel encoders, and an inertial measurement unit (IMU) to construct a robust SLAM system. Unique to our approach is the implementation of a flare removal technique within the BEV imagery, significantly enhancing road marking detection and semantic feature extraction by convolutional neural networks for superior mapping and localization. Our work also pioneers a semantic pre-qualification (SPQ) module, designed to adeptly handle the challenges posed by environments with repetitive textures, thereby enhancing loop detection and system robustness. To demonstrate the effectiveness and resilience of AVM-SLAM, we have released a specialized multi-sensor and high-resolution dataset of an underground garage, accessible at https://yale-cv.github.io/avm-slam_dataset, encouraging further exploration and validation of our approach within similar settings.
Robotics
What problem does this paper attempt to address?
This paper addresses accurate mapping and localization for Automated Valet Parking (AVP) in challenging environments such as underground garages. These environments typically have the following characteristics:

1. **Poor lighting conditions**: Underground garages are usually poorly lit, making it difficult for visual sensors to capture clear images.
2. **Sparse texture**: The garage interior lacks rich texture features, so traditional texture-based SLAM methods struggle to work effectively.
3. **Repetitive structures**: The garage contains many similar structures and markings, which tend to cause mismatches and degrade the robustness and accuracy of the system.
4. **Dynamic scenes**: Vehicles and other objects in the garage move frequently, which makes localization harder.
5. **Lack of GPS signals**: Underground garages usually have no GPS reception, so the system cannot rely on global positioning as an aid.

To address these problems, the authors propose a semantic visual SLAM architecture named AVM-SLAM, whose main features include:

- **Multi-sensor fusion**: It combines four fisheye cameras, wheel encoders, and an inertial measurement unit (IMU) to improve the stability and accuracy of the system.
- **Bird's-eye-view (BEV) perspective**: The Around View Monitor (AVM) subsystem generates a bird's-eye view, which enhances the perception range and robustness of the system.
- **Flare removal technique**: A flare removal technique is applied to the BEV images, which the authors describe as unique to their approach. It significantly improves road-marking detection and semantic feature extraction by convolutional neural networks.
- **Semantic pre-qualification (SPQ) module**: The SPQ module is designed to handle environments with repetitive textures, improving loop detection and overall system robustness.

In addition, to verify the effectiveness and robustness of AVM-SLAM, the authors release a high-resolution, multi-sensor dataset of an underground garage. This dataset can be used for further exploration and validation of SLAM methods in similar environments. In summary, the paper aims to improve mapping and localization accuracy for automated valet parking in complex underground garage environments.
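To make the multi-sensor fusion idea above more concrete, the following is a minimal sketch of planar dead reckoning from a wheel-encoder speed and an IMU yaw rate. It is a generic textbook illustration, not the authors' implementation: the names `Pose2D`, `propagate`, and `wrap_angle`, the midpoint integration scheme, and the assumption of planar motion without wheel slip are all introduced here for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose2D:
    """Hypothetical planar vehicle pose: position in metres, yaw in radians."""
    x: float = 0.0
    y: float = 0.0
    yaw: float = 0.0


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def propagate(pose: Pose2D, wheel_speed: float, yaw_rate: float, dt: float) -> Pose2D:
    """Advance the pose by one time step.

    wheel_speed: forward speed from the wheel encoders (m/s), assumed slip-free.
    yaw_rate:    angular rate about the vertical axis from the IMU (rad/s).
    dt:          time step (s).
    Uses midpoint integration: the heading halfway through the step
    sets the direction of travel for that step.
    """
    yaw_mid = pose.yaw + 0.5 * yaw_rate * dt
    return Pose2D(
        x=pose.x + wheel_speed * math.cos(yaw_mid) * dt,
        y=pose.y + wheel_speed * math.sin(yaw_mid) * dt,
        yaw=wrap_angle(pose.yaw + yaw_rate * dt),
    )


def integrate(pose: Pose2D, wheel_speed: float, yaw_rate: float,
              dt: float, steps: int) -> Pose2D:
    """Apply `propagate` repeatedly with constant inputs."""
    for _ in range(steps):
        pose = propagate(pose, wheel_speed, yaw_rate, dt)
    return pose


# Usage: drive a quarter circle of radius 1 m in one second.
quarter_turn = integrate(Pose2D(), wheel_speed=math.pi / 2,
                         yaw_rate=math.pi / 2, dt=0.001, steps=1000)
```

Dead reckoning of this kind drifts over time, which is why a SLAM system would typically combine it with visual observations and loop detection rather than rely on it alone.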