A Symbolic-Neural Reasoning Model for Visual Question Answering

Jingying Gao, M. Pagnucco, A. Blair
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191538
2023-06-18
Abstract: State-of-the-art Visual Question Answering (VQA) systems have demonstrated promising performance on visual relationship-based reasoning problems. However, they struggle with complex problems whose answers require sophisticated logical reasoning. In this paper, we introduce a hybrid symbolic-neural reasoning model that integrates deep neural network vision and language features with a symbolic reasoner connected to a knowledge base. The symbolic reasoner combines visual and linguistic information with ontological relationships and common-sense reasoning to address complex logical questions. We replace the multimodal fusion layer in traditional VQA deep neural networks with a logical reasoning component that generates reasoned answers and clear logical inference chains. Moreover, we propose a notion of Question Difficulty that reflects the logical complexity of VQA questions and how hard they are to answer. Current VQA approaches excel at straightforward logic but struggle as question difficulty increases. Our hybrid method performs better because it has access to an additional logical reasoner, through the knowledge base, to produce answers that require logical inference. Experimental analysis of the answers and the key evidential predicates generated using our LoRA (Logical Reasoning Associated VQA) dataset is used to validate our approach and demonstrate its advantages.
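The idea of answering a question by combining detected visual predicates with ontological knowledge, while keeping the inference chain, can be illustrated with a toy sketch. This is not the authors' implementation: the predicate names (`is_a`, `on`, `subclass_of`), the example facts, and the helper functions `infer`, `explain`, and `answer` are all hypothetical, and the single forward-chaining rule stands in for whatever reasoner the paper actually uses.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    pred: str
    subj: str
    obj: str


# Hypothetical output of a neural vision module (assumed, not from the paper).
VISUAL_FACTS = {Fact("is_a", "obj1", "dog"), Fact("on", "obj1", "sofa")}

# Hypothetical ontology fragment from a knowledge base (assumed).
KB_FACTS = {
    Fact("subclass_of", "dog", "mammal"),
    Fact("subclass_of", "mammal", "animal"),
}


def infer(facts):
    """Forward-chain one rule: is_a(x, c) and subclass_of(c, d) => is_a(x, d).

    Returns all known facts plus a map from each derived fact to its premises.
    """
    known = set(facts)
    premises = {}
    changed = True
    while changed:
        changed = False
        for f in list(known):
            if f.pred != "is_a":
                continue
            for g in list(known):
                if g.pred == "subclass_of" and g.subj == f.obj:
                    new = Fact("is_a", f.subj, g.obj)
                    if new not in known:
                        known.add(new)
                        premises[new] = (f, g)
                        changed = True
    return known, premises


def explain(fact, premises):
    """Unfold a derived fact into its inference chain, premises first."""
    if fact not in premises:
        return [fact]
    f, g = premises[fact]
    return explain(f, premises) + [g, fact]


def answer(category, location):
    """Answer 'Is there a <category> on the <location>?' with a chain."""
    known, premises = infer(VISUAL_FACTS | KB_FACTS)
    for f in known:
        if f.pred == "on" and f.obj == location:
            target = Fact("is_a", f.subj, category)
            if target in known:
                return "yes", explain(target, premises) + [f]
    return "no", []


if __name__ == "__main__":
    verdict, chain = answer("animal", "sofa")
    print(verdict)
    for step in chain:
        print(f"  {step.pred}({step.subj}, {step.obj})")
```

In this sketch the question "Is there an animal on the sofa?" cannot be answered from the visual facts alone, since the detector only reports a dog; the answer follows once the ontology links dog to animal, and the returned chain lists the predicates used as evidence.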
Computer Science