Scene Graph Masked Variational Autoencoders for 3D Scene Generation

Rui Xu,Le Hui,Yuehui Han,Jianjun Qian,Jin Xie
DOI: https://doi.org/10.1145/3581783.3612262
2023-01-01
Abstract:Generating realistic 3D indoor scenes requires a deep understanding of objects and their spatial relationships. However, existing methods often fail to generate realistic 3D scenes due to the limited understanding of object relationships. To tackle this problem, we propose a Scene Graph Masked Variational Auto-Encoder (SG-MVAE) framework that fully captures the relationships between objects to generate more realistic 3D scenes. Specifically, we first introduce a relationship completion module that adaptively learns the missing relationships between objects in the scene graph. To accurately predict the missing relationships, we employ multi-group attention to capture the correlations between the objects with missing relationships and other objects in the scene. After obtaining the complete scene relationships, we mask the relationships between objects and use a decoder to reconstruct the scene. The reconstruction process enhances the model's understanding of relationships, generating more realistic scenes. Extensive experiments on benchmark datasets show that our model outperforms state-of-the-art methods.
What problem does this paper attempt to address?