RGM: A Robust Generalist Matching Model.
Songyan Zhang, Xinyu Sun, Hao Chen, Bo Li, Chunhua Shen
DOI: https://doi.org/10.48550/arxiv.2310.11755
2023-01-01
Abstract: Finding corresponding pixels within a pair of images is a fundamental computer vision task with various applications. Because tasks such as optical flow estimation and local feature matching have different requirements, previous works fall mainly into two categories, dense matching and sparse feature matching, each relying on specialized architectures and task-specific datasets, which may hinder the generalization performance of the specialized models. In this paper, we propose a deep model for both sparse and dense matching, termed RGM (Robust Generalist Matching). In particular, we design a cascaded GRU module that refines matches by iteratively exploring the geometric similarity at multiple scales, followed by an additional uncertainty estimation module for sparsification. To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth by generating optical flow supervision with greater intervals. As such, we are able to mix various dense and sparse matching datasets, significantly improving the training diversity. The generalization capacity of our proposed RGM is greatly improved by learning the matching and uncertainty estimation in a two-stage manner on the large, mixed data. Superior performance is achieved for zero-shot matching and downstream geometry estimation across multiple datasets, outperforming the previous methods by a large margin.
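
The idea of sparsifying a dense match field with an uncertainty estimate can be illustrated with a minimal sketch. This is not the authors' implementation: in RGM the uncertainty comes from a learned module, whereas here it is simply given as input, and the function name `sparsify`, the parameter `max_uncertainty`, and the threshold value 0.5 are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def sparsify(
    flow: List[List[Point]],
    uncertainty: List[List[float]],
    max_uncertainty: float = 0.5,  # hypothetical threshold, not from the paper
) -> List[Tuple[Point, Point]]:
    """Turn a dense flow field into sparse correspondences.

    flow[y][x] is the (dx, dy) displacement of pixel (x, y) in the first image.
    uncertainty[y][x] is a per-pixel score; lower means more reliable.
    Pixels whose uncertainty is at or below max_uncertainty are kept.
    """
    matches = []
    for y, (flow_row, unc_row) in enumerate(zip(flow, uncertainty)):
        for x, ((dx, dy), u) in enumerate(zip(flow_row, unc_row)):
            if u <= max_uncertainty:
                # Source pixel and its matched location in the second image.
                matches.append(((x, y), (x + dx, y + dy)))
    return matches


# Toy 2x2 example with made-up values.
flow = [[(1.0, 0.0), (0.5, 0.5)], [(0.0, -1.0), (2.0, 2.0)]]
uncertainty = [[0.1, 0.9], [0.3, 0.7]]
matches = sparsify(flow, uncertainty)
```

With the toy values above, only the two pixels in the left column pass the threshold, so the dense field of four displacements reduces to two sparse correspondences.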