A synthetic digital city dataset for robustness and generalisation of depth estimation models

Jihao Li, Jincheng Hu, Yanjun Huang, Zheng Chen, Bingzhao Gao, Jingjing Jiang, Yuanjian Zhang
DOI: https://doi.org/10.1038/s41597-024-03025-5
2024-03-17
Scientific Data
Abstract: Existing monocular depth estimation driving datasets are limited in the number of images and the diversity of driving conditions. The images of datasets are commonly in a low resolution and the depth maps are sparse. To overcome these limitations, we produce a Synthetic Digital City Dataset (SDCD) which was collected under 6 different weather driving conditions, and 6 common adverse perturbations caused by the data transmission. SDCD provides a total of 930 K high-resolution RGB images and corresponding perfect observed depth maps. The evaluation shows that depth estimation models which are trained on SDCD provide a clearer, smoother, and more precise long-range depth estimation compared to those trained on one of the best-known driving datasets KITTI. Moreover, we provide a benchmark to investigate the performance of depth estimation models in different adverse driving conditions. Instead of collecting data from the real world, we generate the SDCD under severe driving conditions with perfect observed data in the digital world, enhancing depth estimation for autonomous driving.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses several key limitations of existing monocular depth-estimation driving datasets:

1. **Limited number of images**: Existing datasets (such as KITTI and Make3D) contain relatively few images and do not provide enough training samples.
2. **Insufficient diversity of driving conditions**: Existing datasets usually cover only a narrow range of driving conditions (such as weather and illumination) and lack complex, varied driving scenarios.
3. **Low image resolution**: Many existing datasets have low image resolution, which makes it difficult for depth-estimation models to recover high-precision depth information.
4. **Sparse depth maps**: The depth maps in existing datasets are usually sparse and do not describe the environment completely, which limits the performance of depth-estimation models.

To address these problems, the authors introduce a new Synthetic Digital City Dataset (SDCD). Its main features are as follows:

- **Large-scale, high-resolution images**: SDCD provides 930,000 high-resolution (1080×720) RGB images with corresponding dense depth maps, covering a total driving distance of 427.5 kilometers.
- **Multiple driving conditions**: SDCD covers 6 weather driving conditions (sunny, rainy, snowy, hailing, cloudy, sandy) and 6 common data-transmission perturbations (including noise, blurring, JPEG compression, color quantization, and pixelation).
- **Enhanced depth estimation**: Because the data are generated in a virtual world with perfectly observed depth, models trained on SDCD produce clearer, smoother, and more accurate long-range depth estimates, especially in harsh driving conditions.

Through these improvements, SDCD aims to provide richer and more diverse training data for depth-estimation models, thereby enhancing their robustness and generalization in real-world applications.
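
To make the data-transmission perturbations listed above more concrete, the following is a minimal sketch of three of them (additive Gaussian noise, color quantization, and pixelation) applied to a tiny grayscale image. It is an illustration only, not the authors' generation pipeline: the function names and the parameters `sigma`, `levels`, and `block` are hypothetical, and plain Python lists are used instead of an image library to keep the example dependency-free.

```python
import random


def add_gaussian_noise(img, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise to every pixel, then clip to [0, 255]."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]


def quantize(img, levels=4):
    """Reduce the image to `levels` gray values (bin centres)."""
    step = 256 // levels
    out = []
    for row in img:
        new_row = []
        for p in row:
            idx = min(p // step, levels - 1)  # clamp the top bin
            new_row.append(idx * step + step // 2)
        out.append(new_row)
    return out


def pixelate(img, block=2):
    """Replace each block x block tile with the tile's mean value."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            ys = range(y0, min(y0 + block, h))
            xs = range(x0, min(x0 + block, w))
            tile = [img[y][x] for y in ys for x in xs]
            mean = round(sum(tile) / len(tile))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# Hypothetical 4x4 grayscale ramp standing in for a real frame.
image = [[(4 * y + x) * 16 for x in range(4)] for y in range(4)]

noisy = add_gaussian_noise(image)
coarse = quantize(image)
blocky = pixelate(image)
```

Each function returns a new image and leaves the input untouched, so the same clean frame can be reused to produce several perturbed variants for a robustness benchmark.
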
For autonomous driving in particular, SDCD can improve the accuracy of depth-perception tasks.

### Formula summary

The formulas in this paper mainly describe image generation under different conditions and the evaluation of depth-estimation error:

1. **Rainy-day image model**:
   \[ I_{\text{rain}} = B + R_{\text{rain}} \]
   - \(I_{\text{rain}}\): The image observed on a rainy day
   - \(B\): Clean background information
   - \(R_{\text{rain}}\): Rainy-day-specific information
2. **Image model with only rain streaks**:
   \[ I = B + R_s \]
   - \(R_s\): Rain-streak information
3. **Image model with only raindrops**:
   \[ I = (1 - M) \odot B + R_d \]
   - \(M\): Distortion mask
   - \(\odot\): Hadamard product
   - \(R_d\): Raindrop information
4. **Image model with both rain streaks and raindrops**:
   \[ I = (1 - M) \odot B + R_s + \rho R_d \]
   - \(\rho\): Global atmospheric illumination coefficient
5. **Gaussian noise model**:
   \[ A(x,y) = H(x,y) + B(x,y) \]
   - \(A(x,y)\): Noisy image
   - \(H(x,y)\): Noise value at each pixel
   - \(B(x,y)\): Value of each pixel in the original image
6. **Gaussian probability density function**:
   \[ p(z) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(z - \mu)^2}{2\sigma^2}} \]
   - \(z\): Gray value of image pixels
   - \(\mu\): Mean of the pixel values
   - \(\sigma\): Standard deviation of the pixel values
7. **Depth-estimation performance evaluation index**:
   - **Scale Invariant Logarithmic E