Neural Implicit Representation for Highly Dynamic LiDAR Mapping and Odometry

Qi Zhang, He Wang, Ru Li, Wenbin Li
2024-09-26
Abstract: Recent advancements in Simultaneous Localization and Mapping (SLAM) have increasingly highlighted the robustness of LiDAR-based techniques. At the same time, Neural Radiance Fields (NeRF) have introduced new possibilities for 3D scene reconstruction, exemplified by SLAM systems. Among these, NeRF-LOAM has shown notable performance in NeRF-based SLAM applications. However, despite their strengths, these systems often encounter difficulties in dynamic outdoor environments due to their inherent static assumptions. To address these limitations, this paper proposes a novel method designed to improve reconstruction in highly dynamic outdoor scenes. Based on NeRF-LOAM, the proposed approach consists of two primary components. First, we separate the scene into static background and dynamic foreground. By identifying and excluding dynamic elements from the mapping process, this segmentation enables the creation of a dense 3D map that accurately represents the static background only. The second component extends the octree structure to support multi-resolution representation. This extension not only enhances reconstruction quality but also aids in the removal of dynamic objects identified by the first module. Additionally, Fourier feature encoding is applied to the sampled points, capturing high-frequency information and leading to more complete reconstruction results. Evaluations on various datasets demonstrate that our method achieves competitive results compared to current state-of-the-art approaches.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the difficulty that existing SLAM (Simultaneous Localization and Mapping) systems based on NeRF (Neural Radiance Fields) have in accurately reconstructing 3D scenes in highly dynamic outdoor environments. Specifically, these systems usually assume that the environment is static or only slightly dynamic, so they perform poorly on real-world outdoor scenes containing many moving objects, and the resulting reconstruction becomes inaccurate. To address this, the paper proposes a new method that aims to improve 3D reconstruction and localization in highly dynamic outdoor scenes. The main innovations of this method include:

1. **Separation of Background and Foreground**: The scene is divided into a static background and a dynamic foreground. Dynamic elements are identified and excluded, producing a dense 3D map that contains only the static background.
2. **Multi-resolution Octree Structure**: The octree structure in NeRF-LOAM is extended to support multi-resolution representation, which not only improves the reconstruction quality but also helps to remove the dynamic objects identified by the first module.
3. **Fourier Feature Encoding**: Fourier feature encoding is applied to the sampled points to capture high-frequency information, yielding more complete reconstruction results.

### Mathematical Formulas

Some of the key formulas involved in the paper are as follows:

- **Calculation of Average Ground Height**:
  \[ \bar{z}_G=\frac{1}{|P_G|}\sum_{p\in P_G}z_p \]
  where \(P_G\) is the set of ground points within a radius \(r\) of the dynamic mask \(D_i\), and \(z_p\) is the \(z\)-coordinate of each point \(p\).
- **SDF Loss Function**:
  \[ L_d = \frac{1}{|D_i|}\sum_{p_j\in D_i}\left(\Psi(p_j)-\frac{1}{|D_i|}\sum_{p_k\in D_i}\Psi(p_k)\right)^2 \]
  where \(\Psi(p_j)\) is the SDF value of point \(p_j\) in the dynamic region \(D_i\). This is the variance of the SDF values over \(D_i\), so it penalizes SDF predictions that deviate from the region's mean.
- **Multi-resolution Encoding**:
  \[ F_s^\alpha(p)=\sum_{j = D_{\text{max}}-H + 1}^{D_{\text{max}}}F_j^\alpha(p) \]
  where \(F_j^\alpha(p)\) is the embedding obtained by trilinear interpolation over the eight vertices of the node at layer \(j\).
- **Fourier Feature Position Encoding**:
  \[ \gamma(p)=[\sin(2\pi B_1p),\cos(2\pi B_1p),\ldots,\sin(2\pi B_kp),\cos(2\pi B_kp)]^{\top} \]
  where \(B_i\) is a coefficient sampled from an isotropic Gaussian distribution \(N(0,\sigma^2)\).
- **Total Loss Function**:
  \[ L_{\text{total}}=\lambda_sL_s+\lambda_fL_f+\lambda_eL_e+\lambda_dL_d \]
  where \(L_s\), \(L_f\), \(L_e\) and \(L_d\) are the SDF loss, free-space loss, Eikonal loss and dynamic-region SDF loss respectively, and \(\lambda_s\), \(\lambda_f\), \(\lambda_e\) and \(\lambda_d\) are weight parameters.

Through these improvements, the method can achieve more accurate and complete 3D reconstruction in highly dynamic outdoor environments.
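
The formulas above are simple enough to sketch in plain Python. The block below is an illustrative sketch only, not the authors' implementation: all function names are hypothetical, each \(B_i\) is assumed to be a 3-vector so that \(B_i p\) is a dot product with the point \(p\), and the weights passed to `total_loss` are placeholders rather than values from the paper. The multi-resolution octree lookup is omitted because it depends on the octree data structure.

```python
import math
import random


def mean_ground_height(ground_points):
    """Average z-coordinate of the ground points P_G near a dynamic mask."""
    return sum(p[2] for p in ground_points) / len(ground_points)


def dynamic_sdf_loss(sdf_values):
    """L_d: population variance of the SDF values predicted inside D_i."""
    n = len(sdf_values)
    mean = sum(sdf_values) / n
    return sum((v - mean) ** 2 for v in sdf_values) / n


def sample_frequencies(k, sigma, dim=3, seed=0):
    """Draw k frequency vectors B_1..B_k with entries from N(0, sigma^2).

    Assumption: each B_i is a `dim`-vector (one row of a k x dim matrix).
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(k)]


def fourier_features(p, freqs):
    """gamma(p): interleaved sin/cos of 2*pi*(B_i . p), length 2k."""
    out = []
    for b in freqs:
        phase = 2.0 * math.pi * sum(bi * xi for bi, xi in zip(b, p))
        out.extend((math.sin(phase), math.cos(phase)))
    return out


def total_loss(l_s, l_f, l_e, l_d, weights):
    """Weighted sum of SDF, free-space, Eikonal and dynamic-region terms."""
    lam_s, lam_f, lam_e, lam_d = weights
    return lam_s * l_s + lam_f * l_f + lam_e * l_e + lam_d * l_d


if __name__ == "__main__":
    freqs = sample_frequencies(k=4, sigma=1.0)
    encoded = fourier_features((0.1, 0.2, 0.3), freqs)
    print(len(encoded))  # 2k values: one sin and one cos per frequency
    print(dynamic_sdf_loss([0.0, 2.0]))
```

Because `dynamic_sdf_loss` is a variance, it is zero exactly when every SDF value in the region is identical, which matches the role of \(L_d\) as a term that pushes the dynamic region toward a uniform SDF value.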