Difference-Aware Iterative Reasoning Network for Key Relation Detection.

Bowen Zhao, Weidong Chen, Bo Hu, Hongtao Xie, Zhendong Mao
DOI: https://doi.org/10.1109/icme55011.2023.00055
2023-01-01
Abstract: A scene graph serves as a crucial visual representation of an image, and its salient objects provide richer semantics for detecting key relations. However, most methods detect key relations in a single reasoning step, which may not use the available clues effectively. Humans, by contrast, usually review and revise before reaching a final answer, and the semantics of relations offer further linguistic clues. Therefore, we propose the Difference-aware Iterative Reasoning Network (DIRNet) to predict key relations in a multi-step manner. Our model estimates visual saliency, encodes contexts globally with message passing, and then refines its predictions iteratively by considering the differences in predicted relation semantics and contextual information across iterations. Extensive experiments show that our model outperforms state-of-the-art methods in key relation prediction on the VG-KR benchmark and achieves competitive results in common relation prediction on VG, demonstrating its generalization ability and superiority.
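The multi-step refinement idea in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the DIRNet architecture: the function name refine_iteratively, the mixing weights alpha and beta, and the linear update rule are all hypothetical, and a single object pair with hand-written scores stands in for the learned saliency and message-passing modules.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def refine_iteratively(initial_logits, context_logits, steps=3, alpha=0.5, beta=0.5):
    """Toy multi-step refinement of one object pair's relation scores.

    initial_logits: one-step relation scores (hypothetical input).
    context_logits: scores from a context encoder (hypothetical input).
    alpha, beta: hypothetical mixing weights, not taken from the paper.
    Returns the relation distribution after each iteration.
    """
    logits = list(initial_logits)
    prev_probs = softmax(logits)
    history = [prev_probs]
    for _ in range(steps):
        # Blend the current scores with the contextual evidence.
        proposal = [(1 - alpha) * l + alpha * c
                    for l, c in zip(logits, context_logits)]
        probs = softmax(proposal)
        # Difference between this prediction and the previous one.
        diff = [p - q for p, q in zip(probs, prev_probs)]
        # Use the difference as an extra correction signal.
        logits = [x + beta * d for x, d in zip(proposal, diff)]
        prev_probs = softmax(logits)
        history.append(prev_probs)
    return history


if __name__ == "__main__":
    # Three candidate relations; context favors the second one.
    for step, probs in enumerate(refine_iteratively([2.0, 1.0, 0.0], [0.0, 3.0, 0.0])):
        print(step, [round(p, 3) for p in probs])
```

In this sketch the one-step prediction initially prefers the first relation, and the repeated difference-driven updates shift the distribution toward the relation supported by context. A real model would learn these updates rather than use fixed weights.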