KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation

Soofiyan Atar, Yuheng Zhi, Florian Richter, Michael Yip

2024-09-29

Abstract: Depth perception is essential for a robot's spatial and geometric understanding of its environment, with many tasks traditionally relying on hardware-based depth sensors like RGB-D or stereo cameras. However, these sensors face practical limitations, including issues with transparent and reflective objects, high costs, calibration complexity, spatial and energy constraints, and increased failure rates in compound systems. While monocular depth estimation methods offer a cost-effective and simpler alternative, their adoption in robotics is limited due to their output of relative rather than metric depth, which is crucial for robotics applications. In this paper, we propose a method that utilizes a single calibrated camera, enabling the robot to act as a "measuring stick" to convert relative depth estimates into metric depth in real-time as tasks are performed. Our approach employs an LSTM-based metric depth regressor, trained online and refined through probabilistic filtering, to accurately restore the metric depth across the monocular depth map, particularly in areas proximal to the robot's motion. Experiments with real robots demonstrate that our method significantly outperforms current state-of-the-art monocular metric depth estimation techniques, achieving a 22.1% reduction in depth error and a 52% increase in success rate for a downstream task.

Robotics, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses metric depth estimation for robotic operations without dedicated depth hardware. Robots traditionally rely on hardware depth sensors, such as RGB-D or stereo cameras, to obtain spatial and geometric information about their environment. These sensors have practical limitations: they struggle with transparent and reflective objects, are costly and complex to calibrate, impose spatial and energy constraints, and raise failure rates in compound systems. Monocular depth estimation is a cheaper and simpler alternative, but its adoption in robotics is limited because it outputs relative rather than metric depth. The paper proposes KineDepth, a method that uses a single calibrated camera and lets the robot act as a "measuring stick," converting relative depth estimates into metric depth in real time as tasks are performed. The method employs an LSTM-based metric depth regressor that is trained online and refined through probabilistic filtering, recovering metric depth across the monocular depth map, particularly in areas near the robot's motion. In experiments with real robots, the method reduces depth error by 22.1% relative to current state-of-the-art monocular metric depth estimation techniques and increases the success rate of a downstream task by 52%. This enables robots to perform high-precision operational tasks in novel, unstructured environments.
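
The "measuring stick" idea can be illustrated with a much simpler stand-in than the paper's LSTM regressor. The sketch below assumes that relative depth and metric depth are related by a single scale and shift, and fits those two numbers by closed-form least squares from a few pixels on the robot whose metric depth is known (for example from forward kinematics and the calibrated camera). The function names `fit_scale_shift` and `to_metric`, the affine assumption, and the sample values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def fit_scale_shift(relative: Sequence[float],
                    metric: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of metric ~= scale * relative + shift.

    `relative` holds relative depths sampled at pixels on the robot;
    `metric` holds the metric depths of the same points (assumed known,
    e.g. from robot kinematics). This is an illustrative affine model,
    not the LSTM regressor described in the paper.
    """
    n = len(relative)
    if n != len(metric) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0.0:
        raise ValueError("relative depths must not all be equal")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


def to_metric(depth_map: List[List[float]],
              scale: float, shift: float) -> List[List[float]]:
    """Apply the fitted scale and shift to every pixel of a relative depth map."""
    return [[scale * d + shift for d in row] for row in depth_map]


# Hypothetical samples: relative depth at three robot keypoints and the
# metric depth (in meters) of the same keypoints.
relative_samples = [0.2, 0.4, 0.8]
metric_samples = [0.5, 0.9, 1.7]

scale, shift = fit_scale_shift(relative_samples, metric_samples)
metric_map = to_metric([[0.2, 0.6], [0.4, 0.8]], scale, shift)
```

In this toy setting the fit is exact because the samples were generated from one scale and shift. A real relative depth map need not follow a global affine model, which is presumably one reason the paper uses a learned regressor trained online and refined with probabilistic filtering instead.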

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

Depth Estimation with Ego-Motion Assisted Monocular Camera

UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater Robots

Distilled Visual and Robot Kinematics Embeddings for Metric Depth Estimation in Monocular Scene Reconstruction

Radar Meets Vision: Robustifying Monocular Metric Depth Prediction for Mobile Robotics

Depth Map Estimation of Dynamic Scenes Using Prior Depth Information

MBUDepthNet: Real-Time Unsupervised Monocular Depth Estimation Method for Outdoor Scenes

Real-Time Monocular Human Depth Estimation and Segmentation on Embedded Systems

Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots

Towards Real-Time Monocular Depth Estimation for Robotics: A Survey

MobiDepth: Real-Time Depth Estimation Using On-Device Dual Cameras

LightDepth: A resource efficient depth estimation approach for dealing with ground truth sparsity via curriculum learning

DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation

Probabilistic Multimodal Depth Estimation Based on Camera-LiDAR Sensor Fusion

Real-time Monocular Depth Estimation on Embedded Systems

KDepthNet: Mono-Camera Based Depth Estimation for Autonomous Driving

Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics

The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation

Lifelong-MonoDepth: Lifelong Learning for Multidomain Monocular Metric Depth Estimation

Towards Scale-Aware Self-Supervised Multi-Frame Depth Estimation with IMU Motion Dynamics