Multi-Scale Spatio-Temporal Feature Extraction and Depth Estimation from Sequences by Ordinal Classification

Yang Liu
DOI: https://doi.org/10.3390/s20071979
IF: 3.9
2020-01-01
Sensors
Abstract: Depth estimation is a key problem in 3D computer vision and has a wide variety of applications. In this paper, we explore whether a deep learning network can predict depth maps accurately by learning multi-scale spatio-temporal features from image sequences and by recasting depth estimation from a regression task into an ordinal classification task. We design an encoder-decoder network that uses several multi-scale strategies to improve its performance and extracts spatio-temporal features with ConvLSTM. Our experiments show that the proposed method improves error metrics by almost 10% and accuracy metrics by up to 2%. The results also show that extracting spatio-temporal features can dramatically improve performance on the depth estimation task. We plan to extend this work to a self-supervised setting to remove the dependence on large-scale labeled data.
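To make the idea of recasting depth regression as ordinal classification concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the discretization, so the log-spaced bin edges, the depth range, the number of bins, and the function names `make_thresholds`, `depth_to_ordinal`, and `ordinal_to_depth` are all assumptions made for illustration. A depth value is encoded as a series of binary "is the depth beyond this threshold?" targets, and a prediction is decoded by counting how many of those questions are answered "yes".

```python
import math


def make_thresholds(d_min, d_max, num_bins):
    """Return num_bins + 1 bin edges spaced uniformly in log-depth.

    Log spacing is an assumption of this sketch; the abstract does not
    state which discretization the paper uses.
    """
    log_min, log_max = math.log(d_min), math.log(d_max)
    step = (log_max - log_min) / num_bins
    return [math.exp(log_min + step * i) for i in range(num_bins + 1)]


def depth_to_ordinal(depth, edges):
    """Encode a depth as num_bins - 1 binary targets.

    Target k is 1 when the depth lies beyond the k-th inner edge, so the
    targets form a run of ones followed by zeros.
    """
    inner_edges = edges[1:-1]
    return [1 if depth > edge else 0 for edge in inner_edges]


def ordinal_to_depth(probs, edges):
    """Decode per-threshold probabilities back to a depth value.

    The bin index is the number of thresholds judged as exceeded
    (probability above 0.5); the geometric centre of that bin is returned.
    """
    k = sum(1 for p in probs if p > 0.5)
    return math.sqrt(edges[k] * edges[k + 1])


# Hypothetical depth range of 1 m to 81 m split into 4 bins.
edges = make_thresholds(1.0, 81.0, 4)
targets = depth_to_ordinal(10.0, edges)
decoded = ordinal_to_depth([0.9, 0.8, 0.1], edges)
```

In a network, each pixel would carry one such vector of binary outputs, and the ordering of the thresholds is what distinguishes this formulation from plain multi-class classification.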