SemFE: A Feature Matching Method for Learnable Local Semantic Feature Enhancement in Multimodal Images

Rongrui Teng, Yun Liao, Wei Wang, Qing Duan, Junhui Liu, Fangwei Jin, Yunpeng Li, Xu Qian
DOI: https://doi.org/10.21203/rs.3.rs-5275064/v1
2024-01-01
Abstract: Multimodal image matching has long been a critical challenge in computer vision. In recent years, detector-free methods have received widespread attention for their high matching accuracy. However, these methods often fail to fully exploit the semantic information within images, which is crucial for accurate matching. To overcome this limitation, we propose SemFE, a multimodal image matching framework that leverages learnable local semantic feature enhancement. In SemFE, we design a dynamic semantic feature extraction module to capture semantic information, along with a semantic information enhancement module to refine semantically related features. Additionally, we develop a multi-scale feature fusion backbone integrated with a transformer to further enhance feature extraction. Comprehensive experimental results demonstrate that SemFE consistently outperforms competing methods across various multimodal image datasets.