Fast Monocular Depth Estimation via Side Prediction Aggregation with Continuous Spatial Refinement
Jipeng Wu, Rongrong Ji, Qiang Wang, Shengchuan Zhang, Xiaoshuai Sun, Yan Wang, Mingliang Xu, Feiyue Huang
DOI: https://doi.org/10.1109/tmm.2021.3140001
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Recent works have validated the benefit of integrating spatial information into deep networks to improve pixel-level prediction tasks such as monocular depth estimation. However, how to integrate spatial cues efficiently and robustly remains an open problem. In this paper, we introduce the Side Prediction Aggregation (SPA) method to enhance the embedding of scene structural information from low-level to high-level layers. To improve estimation accuracy, the method is further equipped with a continuous Spatial Refinement Loss (SRL) applied at multiple resolutions with negligible extra computation. In addition, the proposed sequential network can perform adversarial learning at multiple resolutions. This adversarial refinement strategy greatly improves the accuracy of the estimated depth at little extra computational cost. Without using any pre-trained models, our network achieves state-of-the-art accuracy on the KITTI, NYUD V2, and Cityscapes datasets while performing online depth estimation in real time.
computer science, information systems, telecommunications, software engineering