VRCopilot: Authoring 3D Layouts with Generative AI Models in VR

Lei Zhang, Jin Pan, Jacob Gettig, Steve Oney, Anhong Guo
DOI: https://doi.org/10.1145/3654777.3676451
2024-08-18
Abstract: Immersive authoring provides an intuitive medium for users to create 3D scenes via direct manipulation in Virtual Reality (VR). Recent advances in generative AI have enabled the automatic creation of realistic 3D layouts. However, it is unclear how capabilities of generative AI can be used in immersive authoring to support fluid interactions, user agency, and creativity. We introduce VRCopilot, a mixed-initiative system that integrates pre-trained generative AI models into immersive authoring to facilitate human-AI co-creation in VR. VRCopilot presents multimodal interactions to support rapid prototyping and iterations with AI, and intermediate representations such as wireframes to augment user controllability over the created content. Through a series of user studies, we evaluated the potential and challenges in manual, scaffolded, and automatic creation in immersive authoring. We found that scaffolded creation using wireframes enhanced the user agency compared to automatic creation. We also found that manual creation via multimodal specification offers the highest sense of creativity and agency.
Human-Computer Interaction, Artificial Intelligence, Emerging Technologies
What problem does this paper attempt to address?
This paper asks how generative AI models can be integrated into an immersive authoring workflow so that users can rapidly prototype and iterate on 3D scenes in virtual reality (VR) while keeping their sense of control (agency), creativity, and participation. Specifically, it explores the following questions:

1. **Integration and Interaction of Generative AI Models**:
   - How can users interact with generative AI models through multimodal inputs such as voice commands and gestures?
   - How can the system make AI-generated content transparent and controllable, so that users can understand and adjust it?
2. **User Agency and Creativity**:
   - How do users' sense of control and creativity differ across the three modes of automatic creation, scaffolded (guided) creation, and manual creation?
   - Can generative AI help users save time and explore more design possibilities while maintaining or enhancing their creative freedom?
3. **Application of Intermediate Representations**:
   - Can wireframes, used as an intermediate representation, help users define and adjust 3D layouts more intuitively and thereby increase their control over the generated content?

### Main Contributions of the Paper

1. **VRCopilot System**:
   - A mixed-initiative immersive authoring system that lets users create 3D layouts in collaboration with pre-trained generative AI models.
   - Support for multiple interaction methods, including multimodal specification and intermediate representations such as wireframes, to improve users' control over and understanding of the generated content.
2. **User Research Results**:
   - Two rounds of user research evaluate the user experience in the different creation modes (manual, automatic, and scaffolded), with a focus on the perceived sense of control and creativity.
   - Scaffolded creation using wireframes gives users a stronger sense of control than automatic creation, while manual creation via multimodal specification provides the highest sense of control and creativity.

### Conclusion

This paper addresses the problem of applying generative AI models in immersive authoring by introducing VRCopilot, with a focus on transparent and controllable human-AI collaboration in VR. Generative AI can speed up and diversify 3D scene creation, but the studies indicate that how it is integrated matters: scaffolding generation with wireframes preserves more user agency than fully automatic generation, and manual creation remains the mode in which users report the most creativity and control.