Exploring Hierarchical Spatial Layout Cues for 3D Point Cloud based Scene Graph Prediction

Mingtao Feng, Haoran Hou, Liang Zhang, Yulan Guo, Hongshan Yu, Yaonan Wang, Ajmal Mian
DOI: https://doi.org/10.1109/tmm.2023.3277736
IF: 7.3
2023-01-01
IEEE Transactions on Multimedia
Abstract: 3D scene graph prediction is important for intelligent agents to gather information and perceive the semantics of their environments. However, constructing an effective graph is nontrivial given the complexity of natural scenes. Existing solutions for graph representation of 3D scenes still reason in a flat manner, distinguishing every detailed discrepancy among all the relationships and ignoring the mechanism humans use to perform this task. Inspired by the role of the prefrontal cortex in hierarchical reasoning, we analyze this problem from a novel perspective: exploring hierarchical spatial layout cues in 3D space and navigating that hierarchy to make the 3D scene graph more accurate, using a vertical-division-to-horizontal-propagation strategy. To this end, we first encode the contextual object features for fine-grained object category classification. Next, we build a bottom-up hierarchical graph to predict remarkably diverse support relationships as a single concept, regardless of numerous irrelevant relationships. Finally, equipped with the spatially true and semantically meaningful support relationships, we focus on the local region layout and propagate the semantic features to predict the additional non-support relationships under the guidance of the referred hierarchical graph nodes. Experiments on the challenging 3DSSG benchmark show that our algorithm outperforms existing state-of-the-art methods and can also alleviate the impact of the long-tailed distribution of training data. Our code is available at https://github.com/HHrEtvP/HSLC-3DSG/.
computer science, information systems, telecommunications, software engineering