TKwinFormer: Top k Window Attention in Vision Transformers for Feature Matching

Yun Liao, Yide Di, Hao Zhou, Kaijun Zhu, Mingyu Lu, Yijia Zhang, Qing Duan, Junhui Liu
2023-08-29
Abstract: Local feature matching remains a challenging task, primarily due to difficulties in matching sparse keypoints and low-texture regions. The key to solving this problem lies in effectively and accurately integrating global and local information. To achieve this goal, we introduce an innovative local feature matching method called TKwinFormer. Our approach employs a multi-stage matching strategy to optimize the efficiency of information interaction. Furthermore, we propose a novel attention mechanism called Top K Window Attention, which facilitates global information interaction through window tokens prior to patch-level matching, resulting in improved matching accuracy. Additionally, we design an attention block to enhance attention between channels. Experimental results demonstrate that TKwinFormer outperforms state-of-the-art methods on various benchmarks. Code is available at: https://github.com/LiaoYun0x0/TKwinFormer
Image and Video Processing