Multi-Scale Graph Attention Network for Scene Graph Generation

Min Chen, Xinyu Lyu, Yuyu Guo, Jingwei Liu, Lianli Gao, Jingkuan Song
DOI: https://doi.org/10.1109/icme52920.2022.9859970
2022-01-01
Abstract: A scene graph provides a high-level understanding of an image and has a wide range of applications in computer vision. Previous methods design elaborate message passing strategies and treat all instances in the image uniformly when capturing contextual information. These methods, however, fail to grasp the salient objects and their relations, which are the basis of understanding image content. To capture the interaction among salient instances, we propose a novel Multi-Scale Graph Attention Network (MSGAT) that gradually shrinks the graph scale to retain salient instances, and then expands it to encode the multi-scale context. MSGAT contains two sub-modules: Multi-Scale Message Passing (MSMP) and a Relationship Filtering Module (RFM), which are designed to enhance the features of salient instances and to filter redundant relationships, respectively. Extensive experiments demonstrate that MSGAT outperforms previous methods and achieves state-of-the-art performance on Visual Genome.
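The shrink-then-expand idea can be illustrated with a small sketch. The abstract does not say how saliency is scored, how the scales are fused, or how relationships are filtered, so everything below is an assumption made for illustration: top-k node selection for shrinking, residual addition for expanding, and a score threshold for filtering. The function names `shrink`, `expand`, and `filter_relations` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def shrink(features: List[Vector], scores: List[float], keep: int) -> Tuple[List[Vector], List[int]]:
    """Keep the `keep` highest-scoring nodes (assumed top-k pooling)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(order[:keep])  # restore original node order
    return [features[i] for i in kept], kept


def expand(coarse: List[Vector], kept: List[int], full: List[Vector]) -> List[Vector]:
    """Scatter coarse features back to their original slots and add them
    to the full-scale features (assumed residual fusion)."""
    out = [list(row) for row in full]
    for row, idx in zip(coarse, kept):
        out[idx] = [a + b for a, b in zip(out[idx], row)]
    return out


def filter_relations(pair_scores: Dict[Tuple[int, int], float], threshold: float) -> List[Tuple[int, int]]:
    """Drop candidate subject-object pairs scoring below the threshold
    (assumed stand-in for relationship filtering)."""
    return [pair for pair, score in pair_scores.items() if score >= threshold]


# Toy example: four detected instances with 2-d features and saliency scores.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
scores = [0.9, 0.1, 0.8, 0.3]

coarse, kept = shrink(features, scores, keep=2)   # salient nodes only
fused = expand(coarse, kept, features)            # multi-scale context
pairs = filter_relations({(0, 2): 0.7, (0, 1): 0.2}, threshold=0.5)
```

In this toy run, nodes 0 and 2 survive the shrink step, only their features receive the coarse-scale contribution, and only the pair between them passes the filter. A real model would learn the scores and run attention-based message passing at each scale rather than copying features.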