Abstract: In recent years, stereo matching methods based on convolutional neural networks (CNNs) have achieved significant gains over conventional methods in both speed and accuracy. However, current state-of-the-art disparity estimation algorithms require many parameters and large amounts of computational resources, which makes them poorly suited to edge devices. In this paper, we propose an end-to-end light-weight network (LWNet) for fast stereo matching. It consists of an efficient backbone with multi-scale feature fusion for feature extraction, a 3D U-Net aggregation architecture for disparity computation, and a color-guided 2D CNN for disparity refinement. We adopt MobileNetV2 as the backbone for feature extraction. A channel attention module improves the representational capacity of the features, and multi-resolution information is adaptively incorporated into the cost volume via cross-scale connections. Instead of regular 3D convolutions, we use pseudo-3D convolutions in the 3D U-Net to aggregate the cost volume, giving a better balance between computational cost and accuracy. Further, we introduce a left-right consistency check and color guidance, and design a robust disparity refinement network with skip connections and dilated convolutions that captures global context information and further improves disparity estimation accuracy at little computational and memory cost. All standard convolutions in the disparity refinement network are replaced with depth-wise separable convolutions, which reduces computational complexity and the number of parameters without a significant loss of accuracy. Extensive experiments on the Scene Flow, KITTI 2015, and KITTI 2012 benchmarks demonstrate that the proposed LWNet achieves competitive accuracy compared with state-of-the-art stereo matching methods.
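The parameter savings that the abstract attributes to depth-wise separable and pseudo-3D convolutions can be illustrated with simple weight counts. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, biases are ignored, and the pseudo-3D layer is assumed to factor a k x k x k kernel into a 1 x k x k spatial convolution followed by a k x 1 x 1 convolution along the disparity axis, which may differ from the factorization used in LWNet.

```python
def conv2d_params(c_in, c_out, k):
    """Weights in a standard k x k 2D convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depth-wise convolution (one filter per input
    channel) followed by a 1 x 1 point-wise convolution (bias omitted)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def conv3d_params(c_in, c_out, k):
    """Weights in a standard k x k x k 3D convolution (bias omitted)."""
    return k ** 3 * c_in * c_out


def pseudo3d_params(c_in, c_out, k):
    """Weights in an ASSUMED pseudo-3D factorization: a 1 x k x k spatial
    convolution (c_in -> c_out) followed by a k x 1 x 1 convolution along
    the disparity axis (c_out -> c_out). Bias omitted."""
    spatial = k * k * c_in * c_out
    disparity = k * c_out * c_out
    return spatial + disparity


if __name__ == "__main__":
    # Illustrative channel counts only; they are not taken from the paper.
    c_in, c_out, k = 32, 32, 3
    print("standard 2D conv:        ", conv2d_params(c_in, c_out, k))
    print("depth-wise separable 2D: ", depthwise_separable_params(c_in, c_out, k))
    print("standard 3D conv:        ", conv3d_params(c_in, c_out, k))
    print("pseudo-3D conv (assumed):", pseudo3d_params(c_in, c_out, k))
```

With 32 input and output channels and 3 x 3 kernels, the standard 2D layer has 9,216 weights against 1,312 for the depth-wise separable version, and the standard 3D layer has 27,648 weights against 12,288 for the assumed pseudo-3D factorization. These counts only show where the savings come from; the accuracy trade-off reported in the abstract is an empirical result.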

GPDF-Net: geometric prior-guided stereo matching with disparity fusion refinement

Sparse LIDAR Measurement Fusion with Joint Updating Cost for Fast Stereo Matching

Stereo Matching Method with Integrated Geometric Encoding for Disparity Refinement

Guided aggregation and disparity refinement for real-time stereo matching

Improving Stereo Matching by Incorporating Geometry Prior into Convnet

UGNet: Uncertainty aware geometry enhanced networks for stereo matching

Superpixel Guided Network for Three-Dimensional Stereo Matching

Ensemble Learning with Advanced Fast Image Filtering Features for Semi-Global Matching

Improved real-time three-dimensional stereo matching with local consistency

Edge-preserving Guided Filtering Based Cost Aggregation for Stereo Matching

CGFNet: 3D Convolution Guided and Multi-scale Volume Fusion Network for fast and robust stereo matching

GA-Stereo: A Real-Time Stereo Network Based on the Gradient Flow Shunting Strategy and the Atrous Pyramid Network

CGI-Stereo: Accurate and Real-Time Stereo Matching via Context and Geometry Interaction

Stereo matching algorithm using improved guided filtering

Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks

Parallax attention stereo matching network based on the improved group-wise correlation stereo network

Fast stereo matching using adaptive guided filtering

EGOF-Net: Epipolar Guided Optical Flow Network for Unrectified Stereo Matching

A Light-Weight Network with Multi-Scale Features Fusion and Color Guidance for Stereo Matching

A Joint 2D-3D Complementary Network for Stereo Matching

IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching