Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing

Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang
2024-04-02
Abstract: Scene representations using 3D Gaussian primitives have produced excellent results in modeling the appearance of static and dynamic 3D scenes. Many graphics applications, however, demand the ability to manipulate both the appearance and the physical properties of objects. We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics from vision language foundation models that are grounded by natural language. Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians, that enables semi-automatic scene decomposition using text queries. Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator, in which material properties are assigned automatically via text queries. We ablate key techniques used in this pipeline, to illustrate the challenge and opportunities in using feature-carrying 3D Gaussians as a unified format for appearance, geometry, material properties and semantics grounded on natural language. Project website:
Computer Vision and Pattern Recognition, Artificial Intelligence, Graphics, Machine Learning
What problem does this paper attempt to address?
The paper asks how to semi-automatically synthesize dynamic scenes from static 3D scenes while manipulating both the appearance and the physical properties of the objects in them. The authors propose a method named "Feature Splatting", which extends existing point-based rendering techniques: it uses 3D Gaussian primitives as the geometric primitives and augments them with view-invariant features from vision and vision-language foundation models. The paper also explores how to assign material properties automatically through natural-language queries, so that physics-based dynamics can be simulated on top of a static scene. The resulting representation supports rasterizing high-quality images as well as physics-based dynamic scene synthesis, unifying photorealism, rich semantics, and physical dynamics.

The main contributions of the paper are:

1. **Feature Splatting**: a method that augments static scenes with semantics and enables language-driven, realistic physical motion.
2. **Algorithmic and system-level solutions**: a material point method (MPM) based physics engine adapted to the Gaussian representation, and a novel method for fusing features from multiple 2D vision foundation models to achieve accurate decomposition.
3. **Editing tool**: a demonstration that Feature Splatting works as a language-driven scene editing tool, supporting basic operations such as object removal, scaling, rotation, translation, and cloning.

Together, these contributions provide a framework that combines rich semantic understanding of 3D scenes with physics-based dynamics simulation while maintaining high-quality rendering.
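
The idea of decomposing a scene with a text query and then assigning material properties to the selected Gaussians can be illustrated with a small sketch. This is not the authors' implementation. It assumes each Gaussian already carries a feature vector and that the text query has been embedded into the same space by some vision-language encoder. The toy 3-dimensional vectors, the `threshold` value, the function names, and the material labels `"elastic"` and `"static"` are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_gaussians(features, query_embedding, threshold=0.8):
    """Return indices of Gaussians whose feature is similar enough to the query.

    `threshold` is an assumed cutoff, not a value taken from the paper.
    """
    return [
        i for i, feature in enumerate(features)
        if cosine(feature, query_embedding) >= threshold
    ]


def assign_materials(num_gaussians, selected, material, default):
    """Label selected Gaussians with `material` and all others with `default`."""
    chosen = set(selected)
    return [material if i in chosen else default for i in range(num_gaussians)]


# Toy per-Gaussian features (hypothetical): the first two belong to one object,
# the last two to another.
features = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]

# Hypothetical embedding of a text query such as "vase".
query_embedding = [1.0, 0.0, 0.0]

selected = select_gaussians(features, query_embedding)
materials = assign_materials(len(features), selected, "elastic", "static")
print(selected)
print(materials)
```

In a full pipeline, the per-Gaussian material labels would then be handed to a particle-based simulator; that step is outside the scope of this sketch.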