Abstract: In large-scale, long-term dynamic environments, frequently moving objects inevitably cause significant changes in the appearance of a scene at the same location at different times, which is catastrophic for place recognition (PR). Eliminating the influence of dynamic objects to achieve robust PR therefore has broad practical value for mobile robots and autonomous vehicles. To this end, we propose a novel semantically consistent LiDAR PR method based on a chained cascade network, called SC_LPR, which mainly consists of a LiDAR semantic image inpainting network (LSI-Net) and a semantic pyramid Transformer-based PR network (SPT-Net). Specifically, LSI-Net is a coarse-to-fine generative adversarial network (GAN) with a gated convolutional autoencoder as the backbone. To effectively address the challenges posed by variable-scale dynamic object masks, we integrate an updated Transformer block with mask attention and a gated trident block into LSI-Net. Subsequently, to generate a discriminative global descriptor representing the point cloud, we design an encoder with a pyramid Transformer block that efficiently encodes long-range dependencies and global context between different categories in the inpainted semantic image, followed by an augmented NetVLAD, a generalized VLAD (Vector of Locally Aggregated Descriptors) layer that adaptively aggregates salient local features. Finally, we make a first attempt to create a LiDAR semantic inpainting dataset, called LSI-Dataset, to effectively validate the proposed method. Experimental comparisons show that our method improves semantic inpainting performance by about 6% and PR performance in dynamic environments by about 8% compared to the best-performing representative baseline. LSI-Dataset will be publicly available at https://github.KD.LPR.com/.
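
To make the aggregation step concrete, the following is a minimal sketch of standard NetVLAD-style soft-assignment aggregation in plain Python. It is not the augmented NetVLAD described in the abstract, whose details are not given here. The function name `netvlad_aggregate`, the toy feature values, and the cluster parameters (`centers`, `weights`, `biases`) are all hypothetical and chosen only for illustration; in a trained network these parameters would be learned.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def l2_normalize(vec, eps=1e-12):
    # Scale a vector to (approximately) unit L2 norm.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def netvlad_aggregate(features, centers, weights, biases):
    """Aggregate N local D-dim features into one K*D global descriptor.

    Assumed (illustrative) parameters: `centers` are K cluster centers,
    `weights` and `biases` define the linear soft-assignment layer.
    """
    K, D = len(centers), len(centers[0])
    vlad = [[0.0] * D for _ in range(K)]
    for x in features:
        # Soft-assign the feature to every cluster.
        scores = [sum(w * xi for w, xi in zip(weights[k], x)) + biases[k]
                  for k in range(K)]
        assign = softmax(scores)
        # Accumulate assignment-weighted residuals to each center.
        for k in range(K):
            for d in range(D):
                vlad[k][d] += assign[k] * (x[d] - centers[k][d])
    # Intra-normalize each cluster row, flatten, then normalize globally.
    flat = [v for row in vlad for v in l2_normalize(row)]
    return l2_normalize(flat)

# Toy input: three 2-D local features and two clusters (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [[0.0, 0.0], [2.0, 2.0]]
biases = [0.0, -2.0]

descriptor = netvlad_aggregate(features, centers, weights, biases)
print(len(descriptor))  # 4, i.e. K * D
```

The resulting vector has fixed length K * D regardless of how many local features are supplied, which is what allows descriptors of two scans to be compared directly, for example by Euclidean distance.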

Pyramid Learnable Tokens for 3D LiDAR Place Recognition.

RINet: Efficient 3D Lidar-Based Place Recognition Using Rotation Invariant Neural Network

Pyramid Point Cloud Transformer for Large-Scale Place Recognition.

3D LiDAR-Based Global Localization Using Siamese Neural Network

LiDAR‐based place recognition for mobile robots in ground/water surface multiple scenes

An Efficient 3-D Point Cloud Place Recognition Approach Based on Feature Point Extraction and Transformer

CCTNet: A Circular Convolutional Transformer Network for LiDAR-based Place Recognition Handling Movable Objects Occlusion

CSPFormer: A cross-spatial pyramid transformer for visual place recognition

AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition

OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition

PT-Net: Pyramid Transformer Network for Feature Matching Learning

OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition

Hybrid CNN-Transformer Features for Visual Place Recognition

Spatial Pyramid-Enhanced NetVLAD With Weighted Triplet Loss for Place Recognition

Multidirection and Multiscale Pyramid in Transformer for Video-Based Pedestrian Retrieval

SC_LPR: Semantically Consistent LiDAR Place Recognition Based on Chained Cascade Network in Long-Term Dynamic Environments

Pyramid Architecture for Multi-Scale Processing in Point Cloud Segmentation

Semantics-enhanced discriminative descriptor learning for LiDAR-based place recognition

Attention-based Pyramid Aggregation Network for Visual Place Recognition