Birth and Death of a Rose

Chen Geng, Yunzhi Zhang, Shangzhe Wu, Jiajun Wu
2024-12-07
Abstract: We study the problem of generating temporal object intrinsics -- temporally evolving sequences of object geometry, reflectance, and texture, such as a blooming rose -- from pre-trained 2D foundation models. Unlike conventional 3D modeling and animation techniques that require extensive manual effort and expertise, we introduce a method that generates such assets with signals distilled from pre-trained 2D diffusion models. To ensure the temporal consistency of object intrinsics, we propose Neural Templates for temporal-state-guided distillation, derived automatically from image features from self-supervised learning. Our method can generate high-quality temporal object intrinsics for several natural phenomena and enable the sampling and controllable rendering of these dynamic objects from any viewpoint, under any environmental lighting conditions, at any time of their lifespan. Project website: https://chen-geng.com/rose4d
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses how to generate temporal object intrinsics from pre-trained 2D foundation models, that is, sequences of an object's geometry, reflectance, and texture as they change over time during a natural process. An example is a rose opening from a bud to full bloom and then withering. Conventional 3D modeling and animation techniques require a great deal of manual effort and expertise to create realistic, temporally evolving object intrinsics, whereas this paper proposes a learning-based method that generates these physically plausible graphics assets without manual intervention. Specifically, the paper proposes the following innovations:

1. **Neural Templates**: To keep the temporal object intrinsics consistent over time, the authors introduce Neural Templates, which are derived automatically from image features produced by self-supervised learning and are used to guide the distillation process according to temporal state.
2. **High-Fidelity 4D Representation**: The paper proposes a hybrid 4D representation that combines K-Planes and Neural Graphics Primitives (NGP) to generate high-quality temporal object intrinsics.
3. **Physically Based Rendering**: By using Physically Based Rendering (PBR), the method can recover photorealistic object texture, and it uses a differentiable PBR renderer during distillation.
4. **4D Distillation Framework**: The paper proposes an optimization framework that generates 4D content by distilling signals from pre-trained 2D diffusion models, using the Neural State Map as a conditional control signal.

Overall, the paper aims to generate temporal object intrinsics from 2D foundation models, enabling high-quality, controllable rendering of dynamic objects.
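To make the K-Planes component of the hybrid 4D representation mentioned above more concrete, the following is a minimal sketch of a generic K-Planes-style feature lookup: a point (x, y, z, t) is projected onto the six axis-pair planes, each plane is sampled bilinearly, and the samples are combined with an element-wise product. It is an illustration under stated assumptions, not the paper's implementation. The function names, plane resolution, channel count, and random initialization are all hypothetical, and how the paper couples this with NGP and decodes the features is not described in this summary and is not shown here.

```python
import itertools
import random


def make_planes(resolution=8, channels=4, seed=0):
    """Create six random feature planes, one per axis pair of (x, y, z, t).

    Resolution, channel count, and initialization are illustrative assumptions.
    """
    rng = random.Random(seed)
    planes = {}
    for a, b in itertools.combinations(range(4), 2):  # xy, xz, xt, yz, yt, zt
        planes[(a, b)] = [
            [[rng.uniform(0.5, 1.5) for _ in range(channels)]
             for _ in range(resolution)]
            for _ in range(resolution)
        ]
    return planes


def bilinear_sample(plane, u, v):
    """Bilinearly interpolate a plane[row][col][channel] grid at (u, v) in [0, 1].

    u indexes columns and v indexes rows.
    """
    height, width = len(plane), len(plane[0])
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = (1 - fx) * plane[y0][x0][c] + fx * plane[y0][x1][c]
        bottom = (1 - fx) * plane[y1][x0][c] + fx * plane[y1][x1][c]
        out.append((1 - fy) * top + fy * bottom)
    return out


def query_features(planes, point):
    """Combine the six plane samples for point = (x, y, z, t) by element-wise product."""
    channels = len(next(iter(planes.values()))[0][0])
    feature = [1.0] * channels
    for (a, b), plane in planes.items():
        sample = bilinear_sample(plane, point[a], point[b])
        feature = [f * s for f, s in zip(feature, sample)]
    return feature


if __name__ == "__main__":
    planes = make_planes()
    # Query the feature at a spatial location at time t = 0.8 (all coords in [0, 1]).
    print(query_features(planes, (0.2, 0.4, 0.6, 0.8)))
```

In a full pipeline, a feature vector like this would typically be passed to a small decoder that predicts quantities such as density or material parameters. That decoder is omitted here because the summary gives no details about it.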