TinyStereo: A Tiny Coarse-to-Fine Framework for Vision-Based Depth Estimation on Embedded GPUs

Qiong Chang, Xin Xu, Aolong Zha, Meng Joo Er, Yongqing Sun, Yun Li
DOI: https://doi.org/10.1109/tsmc.2024.3395464
2024-07-19
IEEE Transactions on Systems, Man, and Cybernetics: Systems
Abstract: Stereo vision, a popular depth estimation technology in computer vision, finds wide-ranging applications in embedded systems, including robotics vision and autonomous driving. These applications demand both high accuracy and fast processing speeds. To address hardware limitations, most current embedded systems rely on nonlearning algorithms for fast matching, sacrificing accuracy. Some recent studies have explored using convolutional neural networks (CNNs) to improve matching accuracy, but the computational load of existing learning-based systems hampers real-world applicability. This article makes two contributions: 1) a novel stereo matching framework that greatly enhances accuracy on real-time embedded platforms and 2) a two-pronged approach combining a nonlearning-based algorithm and a lightweight super-resolution residual neural network (sRRNet). The nonlearning-based algorithm yields a low-resolution disparity map, and the lightweight sRRNet then generates a high-resolution disparity map from it. Experimental results on benchmark data demonstrate that the proposed method achieves a low matching error rate of 5.17% and a real-time processing speed of 51 fps on the embedded Jetson AGX GPU. The proposed method outperforms all existing real-time embedded systems.
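The coarse-to-fine idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the coarse stage here is a plain sum-of-absolute-differences (SAD) block matcher chosen for illustration, and the fine stage is a nearest-neighbor upsampling stand-in where the paper uses the learned sRRNet. All function names and parameters (downsample, coarse_disparity, upsample_disparity, coarse_to_fine, factor, radius) are hypothetical.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a grayscale image by an integer factor."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def box_sum(a, radius):
    """Sum over a (2r+1)x(2r+1) window around each pixel (edge-padded)."""
    k = 2 * radius + 1
    p = np.pad(a, radius, mode="edge")
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # zero row/column for the integral image
    return c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]

def coarse_disparity(left, right, max_disp, radius=2):
    """Nonlearning coarse stage (assumed here: SAD cost, winner-take-all)."""
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        # Left pixel x is compared with right pixel x - d.
        diff = np.abs(left[:, d:] - right[:, :w - d])
        cost[d, :, d:] = box_sum(diff, radius)
    return cost.argmin(axis=0)

def upsample_disparity(disp, factor):
    """Placeholder for the learned refinement: nearest-neighbor upsampling.
    Disparities are multiplied by the factor because they are measured in pixels."""
    up = np.repeat(np.repeat(disp, factor, axis=0), factor, axis=1)
    return up * factor

def coarse_to_fine(left, right, max_disp, factor=2):
    """Match at low resolution, then bring the disparity map back to full size."""
    low = coarse_disparity(downsample(left, factor),
                           downsample(right, factor),
                           max_disp // factor)
    return upsample_disparity(low, factor)
```

Matching at reduced resolution shrinks both the image area and the disparity search range, which is where the speed of such a scheme would come from; the accuracy lost in the coarse stage is what a learned super-resolution step is meant to recover.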
automation & control systems, computer science, cybernetics