Towards Better Data Exploitation in Self-Supervised Monocular Depth Estimation

Jinfeng Liu, Lingtong Kong, Jie Yang, Wei Liu
DOI: https://doi.org/10.1109/lra.2023.3337594
IF: 5.2
2024-01-01
IEEE Robotics and Automation Letters
Abstract: Depth estimation plays an important role in robotic perception systems. The self-supervised monocular paradigm has gained significant attention because it frees training from the reliance on depth annotations. Despite recent advancements, existing self-supervised methods still underutilize the available training data, which limits their generalization ability. In this letter, we adopt two data augmentation techniques, namely Resizing-Cropping and Splitting-Permuting, to fully exploit the potential of training datasets. Specifically, the original image and the two generated augmented images are fed into the training pipeline simultaneously, and we leverage them to conduct self-distillation. Additionally, we introduce the detail-enhanced DepthNet, which has an extra full-scale branch in the encoder and a grid decoder, to enhance the restoration of fine details in depth maps. Experimental results demonstrate that our method achieves state-of-the-art performance on the KITTI and Cityscapes datasets. Moreover, our KITTI models also show superior generalization performance when transferred to the Make3D, NYUv2 and Cityscapes datasets.
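The two augmentations named in the abstract can be pictured with a small NumPy sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: Resizing-Cropping is taken to mean enlarging the image and cropping a window of the original size, and Splitting-Permuting is taken to mean cutting the image at one column and swapping the two pieces. The function names and parameters (`resize_crop`, `split_permute`, `inverse_split_permute`, `scale`, `top`, `left`, `split_col`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def resize_crop(img, scale, top, left):
    """Assumed Resizing-Cropping: enlarge by `scale` (nearest neighbour),
    then crop a window of the original size starting at (top, left)."""
    h, w = img.shape[:2]
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Map every pixel of the enlarged image back to a source pixel.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    enlarged = img[rows][:, cols]
    return enlarged[top:top + h, left:left + w]


def split_permute(img, split_col):
    """Assumed Splitting-Permuting: cut at `split_col`, swap the two pieces."""
    return np.concatenate([img[:, split_col:], img[:, :split_col]], axis=1)


def inverse_split_permute(img, split_col):
    """Undo split_permute for an array of the same width (e.g. a depth map)."""
    width = img.shape[1]
    return split_permute(img, width - split_col)


if __name__ == "__main__":
    image = np.arange(24).reshape(4, 6)          # toy single-channel image
    cropped = resize_crop(image, scale=1.5, top=1, left=2)
    permuted = split_permute(image, split_col=2)
    restored = inverse_split_permute(permuted, split_col=2)
    print(cropped.shape, permuted.shape, np.array_equal(restored, image))
```

Under these assumptions, a depth map predicted from the permuted view could be passed through `inverse_split_permute` so that it lines up pixel for pixel with the prediction from the original view, which is one way a self-distillation loss between the views might be set up.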