Coordinating explicit and implicit knowledge for knowledge-based VQA

Qunbo Wang, Jing Liu, Wenjun Wu
DOI: https://doi.org/10.1016/j.patcog.2024.110368
IF: 8
2024-02-28
Pattern Recognition
Abstract: Pre-trained models often generate plausible-looking statements that are factually incorrect because the implicit knowledge stored in their parameters is inaccurate. Related methods retrieve explicit knowledge from an external knowledge source to improve prediction performance and reliability. However, these methods often train the retriever with weak signals and require the model to base every prediction on the retrieved knowledge, even when that knowledge is unreliable or the model could produce a better prediction using only its implicit knowledge. It is therefore necessary to enable the pre-trained model to actively select the more beneficial knowledge for each prediction. This work proposes a novel method that helps the model Coordinate Explicit and Implicit Knowledge (CEIK) for the knowledge-based visual question answering (VQA) task, an important direction for pre-trained models. Furthermore, a better training signal is proposed for the retriever, based on whether the retrieved knowledge can correct the prediction. Experimental results demonstrate the effectiveness of our method.
computer science, artificial intelligence; engineering, electrical & electronic