Super-Resolution for Monocular Depth Estimation with Multi-Scale Sub-Pixel Convolutions and a Smoothness Constraint.

Shiyu Zhao, Lin Zhang, Ying Shen, Shengjie Zhao, Huijuan Zhang
DOI: https://doi.org/10.1109/access.2019.2894651
IF: 3.9
2019-01-01
IEEE Access
Abstract: Depth estimation from a monocular image is of paramount importance in various vision tasks, such as obstacle detection, robot navigation, and 3D reconstruction. However, obtaining an accurate depth map with clear details and a fine resolution remains an unresolved issue. To address this problem, we exploit image super-resolution concepts and techniques for monocular depth estimation and propose a novel CNN-based approach, namely $MSCN_{NS}$, which involves multi-scale sub-pixel convolutions and a neighborhood smoothness constraint. Specifically, $MSCN_{NS}$ uses sub-pixel convolutions with multi-scale fusion to retrieve a high-resolution depth map with fine details of the scene. Unlike previous multi-scale fusion strategies, the multi-scale features here come from supervised scale branches of the network. Furthermore, $MSCN_{NS}$ incorporates a neighborhood smoothness regularization term to ensure that spatially close pixels with similar features have similar depth values. The effectiveness and efficiency of $MSCN_{NS}$ have been corroborated through extensive experiments conducted on benchmark datasets.
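The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the network layout or the exact form of the regularizer. The function names `pixel_shuffle` and `neighborhood_smoothness`, the 4-neighbor scheme, the Gaussian feature weighting, and the `sigma` parameter are all assumptions made for illustration. The first function shows the channel-to-space rearrangement at the core of a sub-pixel convolution; the second shows one plausible penalty that grows when neighboring pixels with similar features have different depths.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    This is the channel-to-space step of a sub-pixel convolution:
    a convolution produces r*r channels per output channel, and the
    shuffle interleaves them into an r-times larger spatial grid.
    """
    c_r2, h, w = x.shape
    if c_r2 % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)


def neighborhood_smoothness(depth, feat, sigma=1.0):
    """Hypothetical smoothness penalty (an assumption, not the paper's formula).

    depth: (H, W) predicted depth map.
    feat:  (C, H, W) per-pixel features.
    Squared depth differences between 4-neighbors are weighted by
    exp(-||feature difference||^2 / sigma), so pixels with similar
    features are pushed toward similar depths.
    """
    total = 0.0
    count = 0
    for axis in (0, 1):  # vertical, then horizontal neighbors
        d_diff = np.diff(depth, axis=axis)
        f_diff = np.diff(feat, axis=axis + 1)
        weight = np.exp(-np.sum(f_diff ** 2, axis=0) / sigma)
        total += np.sum(weight * d_diff ** 2)
        count += d_diff.size
    return total / count
```

With an upscale factor of `r = 2`, four low-resolution channels become one channel at twice the height and width. The penalty is zero for a constant depth map, and it shrinks at a depth step when the features on either side of the step differ, which is the behavior the abstract describes in words.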