Physics-based Scene Layout Generation from Human Motion

Jianan Li, Tao Huang, Qingxu Zhu, Tien-Tsin Wong
DOI: https://doi.org/10.1145/3641519.3657517
2024-05-21
Abstract: Creating scenes for captured motions that achieve realistic human-scene interaction is crucial for 3D animation in movies or video games. As character motion is often captured in a blue-screened studio without real furniture or objects in place, there may be a discrepancy between the planned motion and the captured one. This gives rise to the need for automatic scene layout generation to relieve the burdens of selecting and positioning furniture and objects. Previous approaches cannot avoid artifacts like penetration and floating due to the lack of physical constraints. Furthermore, some heavily rely on specific data to learn the contact affordances, restricting the generalization ability to different motions. In this work, we present a physics-based approach that simultaneously optimizes a scene layout generator and simulates a moving human in a physics simulator. To attain plausible and realistic interaction motions, our method explicitly introduces physical constraints. To automatically recover and generate the scene layout, we minimize the motion tracking errors to identify the objects that can afford interaction. We use reinforcement learning to perform a dual-optimization of both the character motion imitation controller and the scene layout generator. To facilitate the optimization, we reshape the tracking rewards and devise pose prior guidance obtained from our estimated pseudo-contact labels. We evaluate our method using motions from SAMP and PROX, and demonstrate physically plausible scene layout reconstruction compared with the previous kinematics-based method.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses the problem of generating physically plausible scene layouts from human motion. Creating scenes with realistic human-scene interaction is a challenge in 3D animation for movies and video games, because character motion is often captured in a blue-screened studio without real furniture or objects in place. The captured motion can therefore deviate from the planned one, which motivates automatic scene layout generation to relieve the burden of selecting and positioning furniture and objects. Previous methods cannot avoid artifacts such as penetration and floating because they lack physical constraints. Moreover, some rely heavily on specific data to learn contact affordances, which limits their ability to generalize to different motions. The paper proposes a physics-based approach that simultaneously optimizes a scene layout generator and simulates a moving human in a physics simulator. Explicit physical constraints yield plausible and realistic interaction motions, while minimizing the motion tracking error identifies the objects that can afford the interaction. Reinforcement learning performs a dual optimization of both the character motion imitation controller and the scene layout generator. To facilitate the optimization, the authors reshape the tracking rewards and devise pose prior guidance obtained from estimated pseudo-contact labels. The approach is evaluated on motions from SAMP and PROX and demonstrates physically plausible scene layout reconstruction compared with the previous kinematics-based method. Overall, the main contribution of the paper is a framework called INFERACT, which simultaneously infers the interaction objects and learns human-scene interaction motions from human motion, resulting in physically plausible and realistic scene layouts.
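The idea of identifying an interaction object by minimizing the motion tracking error can be illustrated with a small toy sketch. This is not the paper's implementation: the one-dimensional "simulator", the exhaustive search over candidates (used here in place of reinforcement learning), the exponential reward shape, and every name and number below are assumptions made purely for illustration.

```python
import math


def simulated_pelvis_height(ref_height, in_contact, seat_height):
    """Toy stand-in for a physics simulator (assumption, not the paper's).

    In frames flagged as contact, the body rests on the seat surface, so it
    can neither penetrate it nor float above it. All other frames are
    assumed to be tracked perfectly.
    """
    return seat_height if in_contact else ref_height


def tracking_error(ref_heights, contact_flags, seat_height):
    """Mean absolute pelvis-height error between reference and toy simulation."""
    errors = [
        abs(ref - simulated_pelvis_height(ref, contact, seat_height))
        for ref, contact in zip(ref_heights, contact_flags)
    ]
    return sum(errors) / len(errors)


def tracking_reward(error, scale=10.0):
    """Map an error to a reward in (0, 1]; the exponential form is an assumption."""
    return math.exp(-scale * error)


def select_object(ref_heights, contact_flags, candidates):
    """Pick the candidate object whose surface height gives the lowest error."""
    return min(
        candidates,
        key=lambda name: tracking_error(ref_heights, contact_flags, candidates[name]),
    )


# Hypothetical reference motion: pelvis height (m) while sitting down,
# with made-up pseudo-contact labels for the final frames.
ref_heights = [0.95, 0.80, 0.60, 0.45, 0.45, 0.45]
contact_flags = [False, False, False, True, True, True]

# Hypothetical candidate objects and their surface heights (m).
candidates = {"stool": 0.70, "chair": 0.45, "low_table": 0.30}

best = select_object(ref_heights, contact_flags, candidates)
print(best, tracking_reward(tracking_error(ref_heights, contact_flags, candidates[best])))
```

In this toy setting, an object whose surface height matches the resting height of the reference motion produces the smallest tracking error and therefore the highest reward. The approach described above instead optimizes the layout generator and the imitation controller together in a full physics simulator.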