Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng
2024-08-18
Abstract: This paper aims to tackle the problem of modeling dynamic urban streets for autonomous driving scenes. Recent methods extend NeRF by incorporating tracked vehicle poses to animate vehicles, enabling photo-realistic view synthesis of dynamic urban street scenes. However, significant limitations are their slow training and rendering speed. We introduce Street Gaussians, a new explicit scene representation that tackles these limitations. Specifically, the dynamic urban scene is represented as a set of point clouds equipped with semantic logits and 3D Gaussians, each associated with either a foreground vehicle or the background. To model the dynamics of foreground object vehicles, each object point cloud is optimized with optimizable tracked poses, along with a 4D spherical harmonics model for the dynamic appearance. The explicit representation allows easy composition of object vehicles and background, which in turn allows for scene editing operations and rendering at 135 FPS (1066 $\times$ 1600 resolution) within half an hour of training. The proposed method is evaluated on multiple challenging benchmarks, including KITTI and Waymo Open datasets. Experiments show that the proposed method consistently outperforms state-of-the-art methods across all datasets. The code will be released to ensure reproducibility.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses the problem of efficiently modeling dynamic urban streets for autonomous driving. Existing methods achieve photo-realistic view synthesis of dynamic urban street scenes by extending NeRF (Neural Radiance Fields) with tracked vehicle poses, but they are slow to train and slow to render. To address this, the authors propose Street Gaussians, a new explicit scene representation that improves training and rendering efficiency while maintaining high-quality view synthesis. The main contributions of the paper are:

1. **Proposing Street Gaussians**: A new scene representation for modeling complex dynamic urban scenes, which supports efficient reconstruction and real-time rendering of high-fidelity urban street scenes.
2. **Introducing multiple strategies**: Including a 4D spherical harmonics appearance model, tracked pose optimization, and point cloud initialization. These strategies significantly improve the rendering performance of Street Gaussians.
3. **Experimental verification**: Evaluations on multiple challenging benchmark datasets (such as KITTI and Waymo Open) show that the method outperforms existing methods in rendering quality while rendering more than 100 times faster.

Through these contributions, the paper aims to provide an efficient, high-quality method for modeling dynamic urban streets in autonomous driving simulation.
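
The abstract describes each foreground vehicle as its own point cloud that is placed into the scene through a tracked pose and then composed with the background. A minimal NumPy sketch of that composition step is given here, assuming each tracked pose is a rigid transform (rotation `R`, translation `t`) from the vehicle's local frame to the world frame. The function names and array layouts are illustrative assumptions, not the authors' code, and pose optimization, semantic logits, and the appearance model are left out.

```python
import numpy as np


def transform_means(means_obj, R, t):
    """Map Gaussian centers from a vehicle's local frame to the world frame.

    means_obj: (N, 3) centers in the local frame (hypothetical layout)
    R:         (3, 3) rotation of the tracked pose
    t:         (3,)   translation of the tracked pose
    """
    # Row-vector form of x_world = R @ x_obj + t
    return means_obj @ R.T + t


def transform_rotations(rots_obj, R):
    """Rotate per-Gaussian orientations (N, 3, 3) into the world frame."""
    # matmul broadcasts the (3, 3) pose rotation over the N Gaussians
    return R @ rots_obj


def compose_scene(background_means, object_means, poses):
    """Concatenate background centers with all posed object centers.

    object_means: dict mapping object id -> (N_i, 3) local-frame centers
    poses:        dict mapping object id -> (R, t) for the current frame
    """
    parts = [background_means]
    for obj_id, means_obj in object_means.items():
        R, t = poses[obj_id]
        parts.append(transform_means(means_obj, R, t))
    return np.concatenate(parts, axis=0)
```

Because the representation is explicit, editing the scene in this sketch amounts to changing the `poses` entry for a vehicle (to move it) or dropping its entry from `object_means` (to remove it) before composing.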