Subgraph and Object Context‐masked Network for Scene Graph Generation

Zhenxing Zheng, Zhendong Li, Gaoyun An, Songhe Feng
DOI: https://doi.org/10.1049/iet-cvi.2019.0896
IF: 1.484
2020-01-01
IET Computer Vision
Abstract: Scene graph generation aims to recognise objects and their semantic relationships in an image, which helps computers understand visual scenes. Geometry information is essential for relationship prediction and is usually incorporated into relationship features. Existing methods encode the spatial layout of objects using their coordinates, but in doing so they neglect the context of the objects. In this study, to make full and efficient use of spatial knowledge, the authors propose a novel subgraph and object context‐masked network (SOCNet), consisting of spatial mask relation inference (SMRI) and hierarchical message passing (HMP) modules, to address the scene graph generation task. To exploit spatial knowledge, SMRI masks part of the context of object features according to the spatial layout of the objects and their corresponding subgraph, which facilitates relationship recognition. To refine the features of objects and subgraphs, the authors also propose HMP, which passes highly correlated messages from both microcosmic and macroscopic aspects through a triple‐path structure comprising subgraph–subgraph, object–object, and subgraph–object paths. Finally, statistical co‐occurrence probability is used to regularise relationship prediction. SOCNet integrates HMP and SMRI into a unified network, and comprehensive experiments on the Visual Relationship Detection and Visual Genome datasets indicate that SOCNet outperforms several state‐of‐the‐art methods on two common tasks.
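The abstract does not say how the statistical co‐occurrence probability enters the prediction. One common way to use such statistics, shown here purely as an illustrative sketch and not as the authors' implementation, is to estimate P(predicate | subject class, object class) from training triples and add its logarithm to the model's predicate scores before the softmax. The function names, the smoothing constant `eps`, and the toy data below are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_prior(triples, predicates, eps=1.0):
    """Estimate P(predicate | subject, object) from (subj, pred, obj) triples.

    Assumption: add-eps smoothing so unseen predicates keep non-zero mass.
    """
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1

    def prior(subj, obj):
        c = counts.get((subj, obj), Counter())
        total = sum(c.values()) + eps * len(predicates)
        return [(c[p] + eps) / total for p in predicates]

    return prior


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def regularised_scores(logits, prior_probs):
    """Combine model logits with the log co-occurrence prior (assumed scheme)."""
    return softmax([l + math.log(p) for l, p in zip(logits, prior_probs)])


# Toy usage with made-up annotations.
predicates = ["on", "riding", "wearing"]
triples = [("person", "riding", "horse")] * 3 + [("person", "on", "horse")]
prior = cooccurrence_prior(triples, predicates)

# With uninformative (equal) logits the output reduces to the prior itself.
probs = regularised_scores([0.0, 0.0, 0.0], prior("person", "horse"))
```

In this toy case the visual model expresses no preference, so the statistics alone favour "riding" for a person–horse pair; with informative logits the prior only shifts the scores rather than deciding them.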