Abstract: Among all stereo matching methods, end-to-end (E2E) learning methods achieve the lowest errors and do so most frequently; however, their performance changes drastically across test sites, which indicates poor generalisation. Traditional and similarity-learning stereo matching methods achieve acceptable errors and generalise well across different datasets and test sites. Unlike traditional methods (such as Census-SGM), deep learning methods are found to be more robust to the configuration parameters of the stereo image pairs.

Deep-learning (DL) stereo matching methods have gained great attention for remote sensing satellite datasets. However, most existing studies base their assessments on only one or a few stereo images and lack a systematic evaluation of how robust DL methods are on satellite stereo images with varying radiometric and geometric configurations. This paper evaluates four DL stereo matching methods on hundreds of multi-date, multi-site satellite stereo pairs with varying geometric configurations, against the traditional and widely used Census semi-global matching (Census-SGM), to comprehensively understand their accuracy, robustness, generalisation capabilities and practical potential. The DL methods include a learning-based cost metric using convolutional neural networks (MC-CNN) followed by SGM, and three E2E learning models: Geometry and Context Network (GCNet), Pyramid Stereo Matching Network (PSMNet) and LEAStereo. Our experiments show that E2E algorithms can reach the upper limits of geometric accuracy but may not generalise well to unseen data. The learning-based cost metric and Census-SGM are rather robust and consistently achieve acceptable results. All DL algorithms are robust to the geometric configurations of stereo pairs and are less sensitive to them than Census-SGM, while learning-based cost metrics can generalise to satellite images when trained on different datasets (airborne or ground-view).
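For readers unfamiliar with the traditional baseline named in the abstract, the following is a minimal sketch of a census matching cost, the "Census" part of Census-SGM. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the 3x3 window (`radius=1`) and the rectified left/right convention are hypothetical choices, and the final winner-take-all step stands in for SGM aggregation, which is omitted here.

```python
import numpy as np

def census_transform(img, radius=1):
    """Encode each pixel as a bit string: a bit is 1 where a neighbour is darker than the centre."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    codes = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[radius + dy: radius + dy + h,
                               radius + dx: radius + dx + w]
            codes = (codes << np.uint64(1)) | (neighbour < img).astype(np.uint64)
    return codes

def hamming_cost(census_left, census_right, disparity):
    """Hamming distance between left pixel (y, x) and right pixel (y, x - disparity)."""
    h, w = census_left.shape
    cost = np.full((h, w), np.inf)  # columns with no valid match keep an infinite cost
    xor = census_left[:, disparity:] ^ census_right[:, : w - disparity]
    # Count the set bits of each 64-bit code by unpacking its 8 bytes.
    as_bytes = xor.view(np.uint8).reshape(h, w - disparity, 8)
    cost[:, disparity:] = np.unpackbits(as_bytes, axis=-1).sum(axis=-1)
    return cost

def wta_disparity(left, right, max_disp, radius=1):
    """Winner-take-all disparity from the raw census cost (assumption: no SGM aggregation)."""
    cl = census_transform(left, radius)
    cr = census_transform(right, radius)
    volume = np.stack([hamming_cost(cl, cr, d) for d in range(max_disp + 1)])
    return volume.argmin(axis=0)

# Synthetic example: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((20, 40))
left_img = np.roll(right_img, 3, axis=1)
disparity_map = wta_disparity(left_img, right_img, max_disp=8)
```

In a learning-based cost metric such as MC-CNN, the Hamming distance above would be replaced by a similarity score predicted by a network, while the subsequent aggregation step stays the same.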

A comparative study on deep‐learning methods for dense image matching of multi‐angle and multi‐date remote sensing stereo‐images
