S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery

Qingyuan Yang, Guanzhou Chen, Xiaoliang Tan, Tong Wang, Jiaqi Wang, Xiaodong Zhang

DOI: https://doi.org/10.1109/IGARSS53475.2024.10640492

2024-10-01

Abstract: Stereo matching and semantic segmentation are significant tasks in binocular satellite 3D reconstruction. However, previous studies primarily view these as independent parallel tasks, lacking an integrated multitask learning framework. This work introduces a solution, the Single-branch Semantic Stereo Network (S3Net), which innovatively combines semantic segmentation and stereo matching using Self-Fuse and Mutual-Fuse modules. Unlike preceding methods that utilize semantic or disparity information independently, our method identifies and leverages the intrinsic link between these two tasks, leading to a more accurate understanding of semantic information and disparity estimation. Comparative testing on the US3D dataset demonstrates the effectiveness of our S3Net. Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and reduces the D1-Error and average endpoint error (EPE) in disparity estimation from 10.051 to 9.579 and from 1.439 to 1.403, respectively, surpassing existing competitive methods. Our code is available at: https://github.com/CVEO/S3Net.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper addresses the fact that stereo matching and semantic segmentation in satellite imagery are usually treated as independent tasks, without an integrated multi-task learning framework. As a result, semantic information and disparity estimates are not used to inform each other, which limits the accuracy and robustness of both tasks. To close this gap, the paper proposes the Single-branch Semantic Stereo Network (S3Net), which combines semantic segmentation and stereo matching through Self-Fuse and Mutual-Fuse modules. According to the authors, exploiting the link between the two tasks improves both the semantic predictions and the disparity estimates. On the US3D dataset, S3Net raises mIoU for semantic segmentation from 61.38 to 67.39 and lowers D1-Error from 10.051 to 9.579 and average endpoint error (EPE) from 1.439 to 1.403, outperforming existing competitive methods.
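The three metrics quoted above can be illustrated with a minimal sketch. It uses the conventional definitions of EPE, D1-Error, and mIoU; the paper's exact evaluation protocol (the D1 threshold, handling of invalid pixels, ignored classes) is not stated in this summary, so the 3-pixel threshold and the function names below are assumptions made for illustration only.

```python
def endpoint_error(pred, gt):
    """Average endpoint error (EPE): mean absolute disparity error in pixels."""
    errors = [abs(p - g) for p, g in zip(pred, gt)]
    return sum(errors) / len(errors)


def d1_error(pred, gt, threshold=3.0):
    """Percentage of pixels whose absolute disparity error exceeds `threshold`.

    The 3-pixel default is an assumed, commonly used value, not one
    confirmed by the summary above.
    """
    bad = sum(1 for p, g in zip(pred, gt) if abs(p - g) > threshold)
    return 100.0 * bad / len(pred)


def mean_iou(pred_labels, gt_labels, num_classes):
    """Mean intersection-over-union (in percent) over classes that occur."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_labels, gt_labels))
        inter = sum(1 for p, g in pairs if p == c and g == c)
        union = sum(1 for p, g in pairs if p == c or g == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return 100.0 * sum(ious) / len(ious)


# Toy flattened disparity maps and label maps (hypothetical values).
pred_disp = [1.0, 2.0, 10.0, 4.0]
gt_disp = [1.0, 3.0, 5.0, 4.0]
print(endpoint_error(pred_disp, gt_disp))  # 1.5
print(d1_error(pred_disp, gt_disp))        # 25.0
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Lower is better for EPE and D1-Error, while higher is better for mIoU, which is why the reported results show mIoU rising and the two disparity metrics falling.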
