Search-Based Depth Estimation Via Coupled Dictionary Learning With Large-Margin Structure Inference

Yan Zhang, Rongrong Ji, Xiaopeng Fan, Yan Wang, Feng Guo, Yue Gao, Debin Zhao
DOI: https://doi.org/10.1007/978-3-319-46454-1_52
2016-01-01
Abstract: Depth estimation from a single image is an emerging topic in computer vision and beyond. Existing works typically train a depth regressor from visual appearance. However, the performance of even the state-of-the-art schemes of this kind is still far from satisfactory, mainly because of over-fitting and under-fitting problems in regressor training. In this paper, we offer a different data-driven paradigm that formulates single-image depth estimation from a search-based perspective. In particular, we handle the depth estimation of local patches via a novel cross-modality retrieval scheme, which searches a dataset with 2D-3D mappings for the 3D patches whose structure/appearance is similar to that of the 2D query. To this end, a coupled dictionary learning formulation is proposed to link the 2D query with the 3D patches: cross-modality similarity is measured on the reconstruction coefficients, which yields a rough local depth estimate. In addition, spatial context consistency is introduced to refine the local depth estimates using a Conditional Random Field. We demonstrate the efficacy of the proposed method by comparing it with state-of-the-art approaches on popular public datasets such as Make3D and NYUv2, on which significant performance gains are reported.
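
The retrieval idea in the abstract (compare a 2D query with 3D patches through their reconstruction coefficients) can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and not the paper's formulation: the two dictionaries `D2` and `D3` are random rather than learned, the coupling is simulated by generating both modalities from shared codes, ridge regression stands in for sparse coding, and the Conditional Random Field refinement is omitted. The names `encode` and `retrieve_depth` are hypothetical.

```python
import numpy as np

def encode(D, X, lam=1e-3):
    """Ridge-regularised codes: argmin_A ||X - D A||^2 + lam * ||A||^2.

    A stand-in for sparse coding, chosen only because it has a closed form.
    """
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ X)

def retrieve_depth(query_2d, D2, db_codes, db_depth, lam=1e-3):
    """Return the index and depth patch whose code is nearest to the query's code."""
    a = encode(D2, query_2d[:, None], lam)        # query code, shape (k, 1)
    dists = np.linalg.norm(db_codes - a, axis=0)  # distance to every database code
    best = int(np.argmin(dists))
    return best, db_depth[:, best]

# Toy data: both modalities are generated from the same codes A,
# which simulates what a coupled dictionary is meant to achieve.
rng = np.random.default_rng(0)
k, n, dim = 8, 50, 64
D2 = rng.standard_normal((dim, k))   # appearance (2D) dictionary, assumed given
D3 = rng.standard_normal((dim, k))   # depth (3D) dictionary, assumed given
A = rng.standard_normal((k, n))      # shared reconstruction coefficients
X2 = D2 @ A                          # appearance patches
X3 = D3 @ A                          # depth patches paired with X2

db_codes = encode(D3, X3)            # codes of the database depth patches
query = X2[:, 7] + 0.01 * rng.standard_normal(dim)  # noisy view of patch 7
best, depth = retrieve_depth(query, D2, db_codes, X3)
print(best, depth.shape)
```

In this sketch the query is matched in coefficient space rather than in pixel space, which is the point of coupling the two dictionaries; a rough depth estimate for the query is then read off from the retrieved depth patch.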