Abstract: Three-dimensional (3D) reconstruction from two-dimensional images is an active research field in computer vision, with applications ranging from navigation and object tracking to segmentation and three-dimensional modeling. Traditionally, parametric techniques have been employed for this task. However, recent advancements have seen a shift towards learning-based methods. Given the rapid pace of research and the frequent introduction of new image matching methods, it is essential to evaluate them. In this paper, we present a comprehensive evaluation of various image matching methods using a structure-from-motion pipeline. We assess the performance of these methods on both in-domain and out-of-domain datasets, identifying key limitations in both the methods and benchmarks. We also investigate the impact of edge detection as a pre-processing step. Our analysis reveals that image matching for 3D reconstruction remains an open challenge, necessitating careful selection and tuning of models for specific scenarios, while also highlighting mismatches in how metrics currently represent method performance.

### What problems does this paper attempt to solve?

This paper evaluates current image matching methods for 3D reconstruction and analyzes the limitations of both the methods and their benchmarks. Specifically, it addresses the following key problems:

1. **Limitations of existing image matching methods**: Although deep learning (DL) methods have made significant progress in image matching in recent years, they still struggle with out-of-domain data. For example, when image content differs significantly from the training data (such as transparent objects or extreme illumination changes), existing image matching methods may perform poorly.
2. **Deficiencies in benchmarks**: Most existing benchmark datasets cover only specific scenarios (such as outdoor scenes photographed during the day) and lack diversity and complexity. Models evaluated on them may therefore fail to generalize to new, unseen scenarios in practical applications.
3. **The influence of edge detection**: The paper explores how edge detection as a pre-processing step affects the performance of different image matching methods. The researchers extract edge maps with methods such as DexiNed to test whether this step improves image matching.
4. **Improvement of evaluation metrics**: The paper highlights the limitations of current evaluation metrics (such as mAA) in reflecting method performance and proposes improvements. In particular, it discusses how to report metrics more clearly and consistently so that different methods are easier to compare.

### Main contributions

To achieve the above goals, the paper makes the following main contributions:

1. **Comprehensive comparison of the latest image matching methods**: It compares 20 state-of-the-art image matching methods, 8 of which were proposed in 2024, and discusses their generalization ability and the limitations of current datasets.
2. **Evaluation of the influence of edge detection**: An edge detection step is introduced into the structure-from-motion (SfM) pipeline, and experiments with DexiNed analyze its influence on both traditional and deep-learning-based image matching methods.
3. **Evaluation of the mAA metric**: The paper evaluates the mAA metric in depth, explores how unregistered images influence the metric, and proposes clearer and more consistent reporting practices.

Through these efforts, the paper offers insights for future research, points out problems in current image matching methods and benchmarks, and suggests directions for improving them.
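To illustrate why the treatment of unregistered images matters for mAA, the sketch below computes the metric under two reporting conventions. It is a simplified illustration, not the paper's implementation: it assumes mAA is the mean, over a set of error thresholds, of the fraction of images whose pose error falls within each threshold. The function name `mean_average_accuracy`, the `count_unregistered` flag, the error values, and the thresholds are all hypothetical.

```python
import math


def mean_average_accuracy(errors, thresholds, count_unregistered=True):
    """Simplified mAA: mean over thresholds of the fraction of images
    whose pose error is at or below each threshold.

    errors: per-image pose errors; None marks an unregistered image
            (an assumed convention for this sketch).
    count_unregistered: if True, unregistered images count as failures
            at every threshold; if False, they are silently dropped.
    """
    if count_unregistered:
        scored = [math.inf if e is None else e for e in errors]
    else:
        scored = [e for e in errors if e is not None]
    if not scored:
        return 0.0
    accuracies = [
        sum(e <= t for e in scored) / len(scored) for t in thresholds
    ]
    return sum(accuracies) / len(accuracies)


# Hypothetical example: 5 images, 2 of which the SfM pipeline failed to register.
errors = [0.5, 1.5, 4.0, None, None]
thresholds = [1.0, 2.0, 5.0]  # illustrative thresholds, units left abstract

maa_strict = mean_average_accuracy(errors, thresholds, count_unregistered=True)
maa_lenient = mean_average_accuracy(errors, thresholds, count_unregistered=False)
print(f"counting unregistered images: {maa_strict:.3f}")
print(f"ignoring unregistered images: {maa_lenient:.3f}")
```

With the same reconstruction, the score changes substantially depending on whether unregistered images are penalized or ignored, which is one reason a consistent reporting convention makes results easier to compare.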

Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks
