Automatic Camera Trajectory Control with Enhanced Immersion for Virtual Cinematography

Xinyi Wu, Haohong Wang, Aggelos K. Katsaggelos
2024-05-22
Abstract: User-generated cinematic creations are gaining popularity as everyday entertainment, yet mastering cinematography well enough to produce immersive content remains a challenge. Many existing automatic methods focus on roughly controlling predefined shot types or movement patterns, and so struggle to engage viewers with the circumstances of the actor. Real-world cinematographic rules show that directors can create immersion by comprehensively synchronizing the camera with the actor. Inspired by this strategy, we propose a deep camera control framework that enables actor-camera synchronization in three aspects: frame aesthetics, spatial action, and emotional status in the 3D virtual stage. Following the rule of thirds, our framework first modifies the initial camera placement to position the actor aesthetically. This adjustment is facilitated by a self-supervised adjustor that analyzes frame composition via camera projection. We then design a GAN model that adversarially synthesizes fine-grained camera movement based on the physical action and psychological state of the actor, using an encoder-decoder generator to map kinematics and emotional variables into camera trajectories. Moreover, we incorporate a regularizer to align the generated stylistic variances with specific emotional categories and intensities. The experimental results show, both quantitatively and qualitatively, that our proposed method yields immersive cinematic videos of high quality. Live examples can be found in the supplementary video.
Multimedia, Graphics, Machine Learning
What problem does this paper attempt to address?
The paper addresses how to make virtual cinematography more immersive through automatic camera trajectory control. Its starting point is that users without professional filmmaking knowledge find it hard to produce immersive content. Existing automatic methods can roughly control predefined shot types or movement patterns, but they struggle to engage viewers with the actor's circumstances. The study therefore proposes a deep camera control framework that synchronizes the camera with the actor in three aspects: frame aesthetics, spatial action, and emotional state. The framework first adjusts the initial camera placement so that the actor is framed according to the rule of thirds, using a self-supervised adjustor that analyzes frame composition via camera projection. It then uses a Generative Adversarial Network (GAN) with an encoder-decoder generator to synthesize fine-grained camera movement from the actor's kinematics and emotional variables, with a regularizer that aligns the generated stylistic variations with specific emotional categories and intensities. According to the reported experiments, the method produces high-quality immersive cinematic videos, as measured both quantitatively and qualitatively.
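To make the rule-of-thirds step more concrete, the following is a minimal sketch of how frame composition can be scored through camera projection. It is not the paper's adjustor, which is a learned, self-supervised model. It assumes a plain pinhole camera, and the helper names (`look_at`, `project`, `thirds_error`), the focal length, and the image size are all hypothetical choices made for illustration. The sketch projects the actor's 3D position into the frame and measures its distance to the nearest rule-of-thirds intersection, which is the kind of quantity a placement adjustor could try to reduce.

```python
import numpy as np

def look_at(cam_pos, target, up=(0.0, 1.0, 0.0)):
    """World-to-camera rotation (rows: right, down, forward).

    Assumes `up` is not parallel to the viewing direction.
    """
    cam_pos = np.asarray(cam_pos, dtype=float)
    target = np.asarray(target, dtype=float)
    forward = target - cam_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    # Image y grows downward, so the second row is the negated up vector.
    return np.stack([right, -true_up, forward])

def project(point, cam_pos, rotation, focal, width, height):
    """Pinhole projection of a 3D point to normalized image coordinates in [0, 1]."""
    p_cam = rotation @ (np.asarray(point, dtype=float) - np.asarray(cam_pos, dtype=float))
    u = focal * p_cam[0] / p_cam[2] + width / 2.0
    v = focal * p_cam[1] / p_cam[2] + height / 2.0
    return np.array([u / width, v / height])

def thirds_error(uv):
    """Distance from a normalized image point to the nearest rule-of-thirds intersection."""
    power_points = np.array([[1/3, 1/3], [2/3, 1/3], [1/3, 2/3], [2/3, 2/3]])
    return float(np.min(np.linalg.norm(power_points - uv, axis=1)))

# Hypothetical setup: camera at the origin looking down +z, 1920x1080 frame.
cam = (0.0, 0.0, 0.0)
R = look_at(cam, (0.0, 0.0, 1.0))
centered = project((0.0, 0.0, 5.0), cam, R, focal=1000.0, width=1920, height=1080)
off_center = project((1.6, 0.9, 5.0), cam, R, focal=1000.0, width=1920, height=1080)
print(thirds_error(centered), thirds_error(off_center))
```

In this toy setup, an actor in the dead center of the frame scores worse than one sitting on a thirds intersection. A placement adjustor in the spirit of the paper could use a score like this as a training signal, although the actual loss and network design are not given in the abstract.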