Generalizable Thermal-based Depth Estimation Via Pre-trained Visual Foundation Model

Fan Ruoyu, Zhao Wang, Lin Matthieu, Wang Qi, Liu Yong-Jin, Wang Wenping
DOI: https://doi.org/10.1109/icra57147.2024.10610394
2024-01-01
Abstract: Depth estimation is a crucial task in computer vision, with applications in domains such as 3D reconstruction, robotics, and autonomous driving. Thermal-based depth estimation in particular has unique advantages, including night-time vision. However, existing methods still struggle to generalize robustly because of limited data resources and the spectral differences between thermal and RGB images. In this paper, we present a self-supervised approach that enhances thermal-based depth estimation by leveraging pre-trained visual models originally designed for RGB data. Specifically, we design a novel two-stage training strategy incorporating Low-rank Adapters and Convolutional Adapters, which not only significantly improves accuracy and robustness but also enables impressive zero-shot generalization. Our method outperforms existing thermal-based depth estimation models, opening new possibilities for cross-modal applications in computer vision and robotics research.
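As a rough illustration of the low-rank adapter idea named in the abstract, the sketch below wraps a frozen linear layer with a small trainable low-rank update, in the general LoRA style. It is not the authors' implementation: the class name `LowRankAdapter`, the `rank` and `alpha` values, the layer sizes, and the initialization are all assumptions made for this example. The convolutional adapters and the two-stage training schedule are not sketched, since the abstract gives no details about them.

```python
import numpy as np


class LowRankAdapter:
    """Frozen linear layer plus a trainable low-rank update (LoRA-style).

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight  # frozen weights, assumed pre-trained on RGB data
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))  # trainable down-projection
        self.B = np.zeros((d_out, rank))  # trainable up-projection, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        frozen = x @ self.weight.T          # output of the untouched backbone layer
        update = (x @ self.A.T) @ self.B.T  # low-rank correction of rank <= `rank`
        return frozen + self.scale * update

    def trainable_params(self):
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))   # stand-in for one pre-trained layer
layer = LowRankAdapter(W, rank=4)
x = rng.normal(size=(5, 32))    # stand-in for a batch of thermal features
y = layer(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only `A` and `B` (384 values here, against 2048 in `W`) would be updated during fine-tuning on thermal data.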