Abstract: Social VR platforms enable social, economic, and creative activities by allowing users to create and share their own virtual spaces. In social VR, photography within a VR scene is an important indicator of visitors' activities. Although automatic identification of photo spots within a VR scene can facilitate the process of creating a VR scene and enhance the visitor experience, there are challenges in quantitatively evaluating photos taken in the VR scene and efficiently exploring the large VR scene. We propose PanoTree, an automated photo-spot explorer in VR scenes. To assess the aesthetics of images captured in VR scenes, a deep scoring network is trained on a large dataset of photos collected by a social VR platform to determine whether humans are likely to take similar photos. Furthermore, we propose a Hierarchical Optimistic Optimization (HOO)-based search algorithm to efficiently explore 3D VR spaces with the reward from the scoring network. Our user study shows that the scoring network achieves human-level performance in distinguishing randomly taken images from those taken by humans. In addition, we show applications using the explored photo spots, such as automatic thumbnail generation, support for VR world creation, and visitor flow planning within a VR scene.

What problem does this paper attempt to address?

This paper addresses the problem of automatically identifying and exploring the best photo spots in virtual reality (VR) scenes. It faces two main challenges:

1. **Quantitative evaluation of captured photos**: A method is needed to evaluate whether an image captured in a VR scene is one that a human would be likely to take. This amounts to a quantitative evaluation of the image's aesthetic quality.
2. **Efficient exploration of the entire VR scene**: A VR scene is very large, and the camera's position and orientation together have 6 degrees of freedom (DoF), so searching this space for the best photo spots is computationally intensive.

To solve these problems, the paper proposes PanoTree, an automated photo-spot exploration system based on deep learning and the Hierarchical Optimistic Optimization (HOO) algorithm. The specifics are summarized below.

### Problem background

- **Social VR platforms**: Social VR platforms allow users to create and share their own virtual spaces, and photography has become an important indicator of visitor activity.
- **Challenges**: Automatically identifying photo spots in VR scenes can simplify VR scene creation and enhance the visitor experience, but it is difficult to quantitatively evaluate the quality of photos taken in VR scenes and to explore large VR scenes efficiently.

### Solutions

#### 1. Scoring Network

- **Training data**: The network is trained on a large-scale photo dataset collected from the social VR platform Cluster.
- **Function**: The network predicts whether an input image is likely to have been taken by a human, and this prediction serves as an estimate of the image's aesthetic quality.

#### 2. Hierarchical Optimistic Optimization (HOO) algorithm

- **Space division**: The VR scene is hierarchically divided into sub-regions to gradually narrow the search range.
- **Exploration strategy**: The score output by the scoring network is used as the reward signal, and photo spots in 3D space are explored efficiently by treating the search as a multi-armed bandit problem.

### Main contributions

- **Proposing PanoTree**: A system that can autonomously explore photo spots in VR scenes.
- **Developing a scoring network**: A deep neural network trained on a large-scale real-world photo dataset for evaluating the aesthetic quality of photos in VR scenes.
- **Efficient hierarchical search algorithm**: Efficient spatial hierarchical search for photo-spot exploration is achieved through implicit supervision.
- **Prototype implementation and evaluation**: A prototype of PanoTree search is implemented, the exploration quality is discussed, and potential application scenarios are provided.

### Formula explanation

The key formulas in the description are as follows:

- **Scoring network output**:
  \[ f_S: \mathbb{R}^{H \times W \times C} \rightarrow [0, 1] \]
  where \( H \) and \( W \) are the height and width of the image, and \( C \) is the number of channels.
- **Upper Confidence Bound (UCB)**:
  \[ U_{d,i}(n) = \hat{\mu}_{d,i}(n) + c_s \sqrt{\frac{2 \log N}{T_{d,i}(n)}} + \nu_1 \rho^d \]
  where \( c_s > 0 \), \( \nu_1 > 0 \), and \( 0 < \rho \leq 1 \) are hyperparameters. In standard HOO notation, \( \hat{\mu}_{d,i}(n) \) is the empirical mean reward of node \( i \) at depth \( d \), and \( T_{d,i}(n) \) is the number of times that node has been visited.
- **B-value update**:
  \[ B_{d,i}(n) = \min \left\{ U_{d,i}(n), \max \left\{ B_{d + 1,2i}(n), B_{d + 1,2i + 1}(n) \right\} \right\} \]

These formulas define the output range of the scoring network and the optimistic value estimates that guide the exploration algorithm toward promising sub-regions.
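To make the U-value and B-value rules above concrete, here is a minimal sketch of an HOO-style search. It is not PanoTree's implementation, and it rests on several assumptions: the search domain is a 1D interval rather than the 6-DoF camera space, `toy_score` is a hypothetical stand-in for the scoring network \( f_S \), the names `Node`, `update_b`, and `hoo_search` are invented for illustration, the hyperparameter values are arbitrary, and \( \log N \) is read as the log of the current round number.

```python
import math
import random


class Node:
    """One cell of the hierarchical partition (here: an interval [lo, hi])."""

    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0        # T_{d,i}(n): visits to this cell
        self.mean = 0.0       # mu-hat_{d,i}(n): empirical mean reward
        self.children = None  # created lazily on first descent through the node
        self.b = math.inf     # unvisited cells are maximally optimistic


def toy_score(x):
    """Hypothetical reward in [0, 1] with a single peak at x = 0.7."""
    return max(0.0, 1.0 - 4.0 * abs(x - 0.7))


def update_b(node, n, c_s, nu1, rho):
    """Recompute B-values bottom-up using the two formulas above."""
    if node.count == 0:
        node.b = math.inf
        return
    u = (node.mean
         + c_s * math.sqrt(2.0 * math.log(n) / node.count)
         + nu1 * rho ** node.depth)
    child_b = math.inf
    if node.children is not None:
        for child in node.children:
            update_b(child, n, c_s, nu1, rho)
        child_b = max(child.b for child in node.children)
    node.b = min(u, child_b)


def hoo_search(score, rounds=300, c_s=1.0, nu1=1.0, rho=0.5, seed=0):
    rng = random.Random(seed)
    root = Node(0.0, 1.0, 0)
    best_x, best_r = None, -math.inf
    for n in range(1, rounds + 1):
        # Descend along the child with the larger B-value until an
        # unvisited cell is reached; ties are broken at random.
        node, path = root, [root]
        while node.count > 0:
            if node.children is None:
                mid = (node.lo + node.hi) / 2.0
                node.children = [Node(node.lo, mid, node.depth + 1),
                                 Node(mid, node.hi, node.depth + 1)]
            node = max(node.children, key=lambda c: (c.b, rng.random()))
            path.append(node)
        # Sample a point in the selected cell and observe its reward.
        x = rng.uniform(node.lo, node.hi)
        r = score(x)
        # Update visit counts and running means along the whole path.
        for visited in path:
            visited.count += 1
            visited.mean += (r - visited.mean) / visited.count
        update_b(root, n, c_s, nu1, rho)
        if r > best_r:
            best_x, best_r = x, r
    return best_x, best_r, root


if __name__ == "__main__":
    x, r, _ = hoo_search(toy_score)
    print(f"best sample x={x:.3f}, reward={r:.3f}")
```

In this sketch every cell is split into two halves, which matches the binary child indices \( 2i \) and \( 2i + 1 \) in the B-value update. Extending it to a real scene would mean replacing the interval with a region of camera poses and `toy_score` with a call that renders the view and scores it.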

PanoTree: Autonomous Photo-Spot Explorer in Virtual Reality Scenes

Unifying Terrain Awareness Through Real-Time Semantic Segmentation

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization

Picture Based Virtual Touring

PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving

Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting

SceneViewer: Automating Residential Photography in Virtual Environments

An Exploration Tool for Retrieval of Travel Information with Personal Photos

Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

Neural Free-Viewpoint Performance Rendering under Complex Human-object Interactions

Kairos: Exploring a Virtual Botanical Garden through Point Clouds

Photo tourism: exploring photo collections in 3D

Visual Design and Evaluation of Public Space Scene Based on Virtual Reality Technology

Refocusing Supports of Panorama Light-Field Images in Head-Mounted Virtual Reality

ARPOV: Expanding Visualization of Object Detection in AR with Panoramic Mosaic Stitching

Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces

PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video

A virtual reality platform for dynamic human-scene interaction

HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions

Multiform Art Design Strategy of VR Technology Based on Visual Design Perspective