Attention-Based Dense Decoding Network for Monocular Depth Estimation

Jianrong Wang, Ge Zhang, Mei Yu, Tianyi Xu, Tao Luo
DOI: https://doi.org/10.1109/ACCESS.2020.2990643
IF: 3.9
2020-01-01
IEEE Access
Abstract: Depth estimation is a classic computer vision task that provides a rich representation of objects and their environment. In recent years, the performance of end-to-end depth estimation has improved significantly. However, stacked convolution and pooling operations lose local spatial detail, which is extremely important for monocular depth estimation. To overcome this problem, we propose an encoder-decoder framework with skip connections. Building on the self-attention mechanism, we apply a channel-spatial attention module as a transition layer, which captures depth and spatial positional relationships and improves the representation ability of both the channel and spatial dimensions. We then propose a dense decoding module that makes full use of attention features at different scales during decoding, achieving a larger and denser receptive field while obtaining multi-scale information. Finally, a novel distance-aware loss is introduced to predict finer edges and local details in distant regions. Experiments demonstrate that the proposed method outperforms the state of the art on the KITTI and NYU Depth V2 datasets.