URCDC-Depth: Uncertainty Rectified Cross-Distillation with CutFlip for Monocular Depth Estimation
Shuwei Shao, Zhongcai Pei, Weihai Chen, Ran Li, Zhong Liu, Zhengguo Li
DOI: https://doi.org/10.1109/tmm.2023.3310259
IF: 7.3
2023-01-01
IEEE Transactions on Multimedia
Abstract: This work aims to estimate a high-quality depth map from a single RGB image. Due to the lack of depth clues, making full use of both long-range correlation and local information is critical for accurate depth estimation. To this end, we introduce an uncertainty rectified cross-distillation between a Transformer and a convolutional neural network (CNN) to obtain a comprehensive depth estimator. Specifically, we use the depth estimates from the Transformer branch and the CNN branch as pseudo labels so that the two branches teach each other. At the same time, the pixel-wise depth uncertainty is modeled to mitigate the negative impact of noisy pseudo labels. To prevent the large capacity gap induced by the strong Transformer branch from degrading the cross-distillation, we transfer the feature maps from the Transformer to the CNN and develop coupling units that help the weaker CNN branch leverage the transferred features. Furthermore, we introduce CutFlip, a surprisingly simple yet highly effective data augmentation technique, which forces the model to focus on more valuable depth reasoning clues apart from the vertical image position. Extensive experiments demonstrate that our model, termed URCDC-Depth, outperforms previous state-of-the-art approaches on the KITTI, NYU-Depth-v2 and SUN RGB-D datasets, with no additional computational burden in the evaluation phase. The source code will be publicly available upon acceptance.
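The abstract does not spell out how CutFlip operates, so the following is only a minimal sketch under a stated assumption: that the augmentation splits the image at a random horizontal line and swaps the upper and lower parts, which would be consistent with the stated goal of weakening the vertical-position cue. The function name `cut_flip`, the probability parameter `p`, and the list-of-rows image representation are all hypothetical choices for illustration, not taken from the authors' code.

```python
import random


def cut_flip(image, depth, p=0.5, rng=random):
    """Swap the upper and lower parts of an image/depth pair at a random row.

    Assumption (not confirmed by the abstract): CutFlip cuts along a random
    horizontal line and exchanges the two parts.

    image, depth: lists of rows with the same height.
    p: probability of applying the augmentation (hypothetical default).
    rng: any object with random() and randint(), e.g. random.Random(seed).
    """
    height = len(image)
    if len(depth) != height:
        raise ValueError("image and depth must have the same height")
    # Skip the augmentation with probability 1 - p, or if there is nothing to cut.
    if height < 2 or rng.random() >= p:
        return image, depth
    # Pick a cut row so that both parts are non-empty.
    cut = rng.randint(1, height - 1)
    # The lower part moves to the top and the upper part to the bottom.
    # The same cut is applied to the depth map so pixels stay aligned.
    return image[cut:] + image[:cut], depth[cut:] + depth[:cut]


# Usage: a 6-row toy "image" whose pixel values encode the original row index.
toy_image = [[row] * 3 for row in range(6)]
toy_depth = [[10 * row] * 3 for row in range(6)]
aug_image, aug_depth = cut_flip(toy_image, toy_depth, p=1.0, rng=random.Random(0))
```

Because the swap is applied only to training samples, a sketch like this adds no cost at evaluation time, in line with the abstract's claim of no additional computational burden in the evaluation phase.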
computer science, information systems, telecommunications, software engineering