Hypercomplex context guided interaction modeling for scene graph generation
Zheng Wang, Xing Xu, Yadan Luo, Guoqing Wang, Yang Yang
DOI: https://doi.org/10.1016/j.patcog.2023.109634
IF: 8
2023-05-07
Pattern Recognition
Abstract: Intuitively, humans can consciously and subjectively attend to the interactions between objects, and thus infer reasonable visual relations. However, mainstream approaches of Scene Graph Generation (SGG) strive to alleviate the long-tailed distribution problem with various complicated re-weighting strategies, where a simple concatenation of the refined object features is treated as the final representation of visual relations. In spite of their remarkable progress, such an operation overlooks the importance of interaction on relation recognition. To tackle the problem, this work devises a hyperComplex-Context guided Interaction Modeling (CCIM for short) plug-in, which can be successfully assimilated by the existing methods for performance improvement. Specifically, we first extract the contextual relation feature determined by the constraint relation ≈ union(head, tail) − head object − tail object. Then, we encode the features of relations and objects into hypercomplex space, with three imaginary components, to learn more expressive representations for SGG. Next, guided by the context, we can capture the interaction between a head or tail object and their relation through the Hamilton product. We further reinforce the interaction between enhanced hypercomplex-valued representations of the two entities with Quaternion inner product. At last, the concatenation of all components from the learned hypercomplex feature is adopted as our final relation representation. Extensive experiments on the popular benchmark Visual Genome in various existing approaches demonstrate the effectiveness and generalization of our proposed model-agnostic method under comprehensive evaluation metrics.
computer science, artificial intelligence; engineering, electrical & electronic
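
The two hypercomplex operations named in the abstract, the Hamilton product and the quaternion inner product, can be sketched with NumPy as below. This is only an illustration of the standard quaternion definitions, not the authors' implementation: the function names `hamilton_product` and `quaternion_inner`, the `[r, i, j, k]` component layout along the last axis, and the example inputs are all assumptions made here, and the abstract does not specify how CCIM applies these operations to its relation and object features.

```python
import numpy as np


def hamilton_product(p, q):
    """Hamilton product of quaternions stored as (..., 4) arrays.

    Assumed layout (hypothetical, not from the paper): the last axis holds
    the real part followed by the three imaginary parts, [r, i, j, k].
    """
    a1, b1, c1, d1 = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    a2, b2, c2, d2 = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack(
        [
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
        ],
        axis=-1,
    )


def quaternion_inner(p, q):
    """Standard quaternion inner product: sum of component-wise products."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * q, axis=-1)


if __name__ == "__main__":
    # Basis quaternions: the Hamilton product is non-commutative,
    # with i * j = k and j * i = -k.
    i = np.array([0.0, 1.0, 0.0, 0.0])
    j = np.array([0.0, 0.0, 1.0, 0.0])
    print(hamilton_product(i, j))
    print(hamilton_product(j, i))

    # A batch of made-up quaternion "features", for illustration only.
    rng = np.random.default_rng(0)
    head = rng.normal(size=(3, 4))
    relation = rng.normal(size=(3, 4))
    mixed = hamilton_product(head, relation)
    print(mixed.shape, quaternion_inner(mixed, relation).shape)
```

Because the Hamilton product mixes all four components of both operands, each output component depends on every input component, which is the property that makes it a natural candidate for modeling interaction rather than plain concatenation.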