Monocular image depth estimation using dilated convolution and spatial pyramid pooling structure

Yinzhang Ding, Lu Lin, Lianghao Wang, Ming Zhang, Dongxiao Li, Haojie Ma
DOI: https://doi.org/10.1117/12.2524357
2019-01-01
Abstract: In this work, we address the problem of depth estimation from a single image. The task is challenging because a single still image provides few depth cues on its own, but recent advances in convolutional neural networks (CNNs) have made it possible to learn and predict depth from a single image. We propose a new residual CNN with dilated convolution and a spatial pyramid pooling (SPP) structure to model the ambiguous mapping from a monocular 2D image to its depth map. The advantages of our method come from the use of dilated convolution and multi-scale spatial information. Compared with existing deep CNN-based methods, our method achieves much better results in both indoor and outdoor scenarios.
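As a rough illustration of the two building blocks named in the abstract, the following pure-Python sketch shows how dilation widens the span of a kernel without adding weights, and how spatial pyramid pooling summarizes a feature map at several grid resolutions. It is a generic toy example, not the authors' network: the function names, the 1D convolution, the choice of max pooling, and the pyramid levels are all assumptions made for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D dilated convolution (cross-correlation, as used in CNNs)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D map into an n x n grid per level and concatenate.

    The levels and the use of max pooling are illustrative assumptions.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for by in range(n):
            # Floor for the bin start, ceiling for the bin end,
            # so every bin contains at least one row.
            y0, y1 = (by * h) // n, ((by + 1) * h + n - 1) // n
            for bx in range(n):
                x0, x1 = (bx * w) // n, ((bx + 1) * w + n - 1) // n
                pooled.append(max(feature_map[y][x]
                                  for y in range(y0, y1)
                                  for x in range(x0, x1)))
    return pooled


# A 3-tap kernel with dilation 2 spans 5 input samples.
span = effective_kernel_size(3, 2)
response = dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], dilation=2)

# A 4 x 4 map pooled at levels 1, 2 and 4 yields 1 + 4 + 16 = 21 values.
fmap = [[4 * r + c for c in range(4)] for r in range(4)]
descriptor = spatial_pyramid_pool(fmap)
```

In this sketch the output length of the pooling step depends only on the chosen levels, not on the input size, which is what lets a pyramid of this kind combine context from several spatial scales.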