NAC: Mitigating Noisy Correspondence in Cross-Modal Matching Via Neighbor Auxiliary Corrector.

Yuqing Li, Haoming Huang, Jian Xu, Shao-Lun Huang
DOI: https://doi.org/10.1109/ICASSP48485.2024.10448059
2024-01-01
Abstract: Noisy correspondence in cross-modal matching significantly undermines the performance of existing matching methods. In this paper, we introduce a robust framework named Neighbor Auxiliary Corrector (NAC), which alleviates this noise by exploiting neighbors, i.e., samples with similar textual targets. NAC is inspired by the observation that similar texts tend to correspond to similar images. Leveraging the zero-shot capabilities of Pre-trained Language Models (PLMs), we identify the top-k nearest neighbors of each positive image-text pair. The side information provided by these neighbors is then used for both sample verification and sample rectification. Extensive experiments on benchmark datasets demonstrate that our framework can significantly improve matching performance and is more robust to various levels of noisy correspondence.
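The neighbor idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how neighbors are scored or how verification is performed, so the cosine-similarity search, the `neighbor_agreement` score, and the toy embeddings below are all assumptions made for illustration. The text vectors stand in for PLM embeddings, and the score simply checks whether the image paired with a text resembles the images paired with that text's nearest neighbors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbors(text_embeddings, k):
    """For each text, return the indices of its k most similar other texts."""
    neighbors = []
    for i, query in enumerate(text_embeddings):
        scored = [
            (cosine(query, other), j)
            for j, other in enumerate(text_embeddings)
            if j != i
        ]
        # Highest similarity first; ties broken by lower index.
        scored.sort(key=lambda s: (-s[0], s[1]))
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


def neighbor_agreement(i, neighbors, image_embeddings):
    """Hypothetical verification score (an assumption, not from the paper):
    mean similarity between image i and the images of text i's neighbors."""
    sims = [
        cosine(image_embeddings[i], image_embeddings[j]) for j in neighbors[i]
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Toy data: texts 0-2 are similar to each other, as are texts 3-5.
# Pair 2 is deliberately mismatched: its image belongs with the second group.
texts = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
images = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

neighbors = top_k_neighbors(texts, k=2)
scores = [neighbor_agreement(i, neighbors, images) for i in range(len(texts))]
suspect = scores.index(min(scores))  # pair whose image least matches its neighbors
```

On this toy data the deliberately mismatched pair (index 2) receives the lowest agreement score. A rectification step could then use the neighbors' images to adjust the suspect pair's label, but how NAC does this is not stated in the abstract.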