Abstract: In two-view correspondence learning, prevalent multi-layer perceptron (MLP)-based methods struggle to capture context. To remedy this, recent work stacks convolutional neural network (CNN)-based ResBlocks sequentially, which gives these networks an inherent proficiency in local context extraction. Yet such designs are not tailored to the task and inherit the CNN's difficulty in aggregating global context, leading to performance bottlenecks. To address this problem, this study further explores the potential of the CNN-based framework and proposes MC-Net, a network that integrates both local and global context. Specifically, considering that sparse motion vectors and a dense motion field can be converted into each other through interpolation and sampling, we first transform unordered matches into image-structured data by estimating the dense motion field implicitly. Then, we design a hierarchical rectifying module that rectifies the error of each ordered motion vector with a CNN at multiple levels, enabling MC-Net to perceive global context from coarse-level features and local context from fine-level features simultaneously, which helps it handle discontinuities of the motion field under large scene disparity. Finally, we reconstruct comprehensive context-embedded features from the rectified motion fields at all levels. In addition, previous studies reject outliers using the residuals between rectified and pre-rectified motion vectors at the same layer, which seriously affects inlier prediction accuracy. We instead use the difference between the motion vectors obtained from each layer's reconstruction and those from the first layer before transformation, which yields purer residuals and improves matching performance without extra computational burden. Extensive experiments show that MC-Net outperforms state-of-the-art methods on multiple domains and datasets.
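
The conversion between sparse motion vectors and a dense motion field mentioned above can be illustrated with a small NumPy sketch. This is not MC-Net's implementation: the abstract says the network estimates the dense field implicitly, whereas the sketch below uses an explicit Gaussian-kernel interpolation as a hand-crafted stand-in. The function names, the `grid_size` and `sigma` parameters, and the assumption that keypoint coordinates are normalized to [-1, 1] are all hypothetical choices made for illustration.

```python
import numpy as np

def motion_vectors(matches):
    """Split (N, 4) matches [x1, y1, x2, y2] into positions and motion vectors."""
    pos = matches[:, :2]
    vec = matches[:, 2:] - matches[:, :2]
    return pos, vec

def interpolate_dense_field(pos, vec, grid_size=16, sigma=0.3):
    """Interpolation step: unordered sparse vectors -> ordered (G, G, 2) grid.

    Assumption: a Gaussian-weighted average replaces the learned, implicit
    estimation described in the abstract.
    """
    ticks = np.linspace(-1.0, 1.0, grid_size)
    gx, gy = np.meshgrid(ticks, ticks)                    # field[row=y, col=x]
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)    # (G*G, 2)
    d2 = ((cells[:, None, :] - pos[None, :, :]) ** 2).sum(-1)  # (G*G, N)
    d2 = d2 - d2.min(axis=1, keepdims=True)               # numerical stability
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w = w / w.sum(axis=1, keepdims=True)                  # weights sum to 1 per cell
    return (w @ vec).reshape(grid_size, grid_size, 2)

def sample_field(field, pos):
    """Sampling step: bilinear lookup of the grid at the original positions."""
    g = field.shape[0]
    u = (pos[:, 0] + 1.0) * 0.5 * (g - 1)                 # fractional column index
    v = (pos[:, 1] + 1.0) * 0.5 * (g - 1)                 # fractional row index
    u0 = np.clip(np.floor(u).astype(int), 0, g - 2)
    v0 = np.clip(np.floor(v).astype(int), 0, g - 2)
    du = (u - u0)[:, None]
    dv = (v - v0)[:, None]
    top = field[v0, u0] * (1 - du) + field[v0, u0 + 1] * du
    bottom = field[v0 + 1, u0] * (1 - du) + field[v0 + 1, u0 + 1] * du
    return top * (1 - dv) + bottom * dv

def residual_to_input(field, pos, vec):
    """Per-match distance between sampled vectors and the original input vectors."""
    return np.linalg.norm(sample_field(field, pos) - vec, axis=1)
```

In this sketch the residual is always measured against the original input motion vectors rather than against an intermediate estimate, which loosely mirrors the abstract's choice of comparing each layer's reconstruction with the first-layer vectors. In MC-Net the grid would be processed by a CNN at several resolutions before sampling; that rectification step is omitted here.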
