Arbitrary 3D stylization of radiance fields
Sijia Zhang,Ting Liu,Zhuoyuan Li,Yi Sun
DOI: https://doi.org/10.1016/j.imavis.2024.104971
IF: 3.86
2024-05-01
Image and Vision Computing
Abstract:3D Stylization that creates stylized multi-view images is quite challenging, as it requires not only generating images which align with the desired style but also maintaining consistency across different perspectives. Most previous image style transfer methods focus on the 2D image domain and stylize each view independently, suffering from multi-view inconsistency. To tackle this challenging problem, we build on the neural radiance fields (NeRF) to stylize each 3D scene, as NeRF inherently ensures consistency across multiple perspectives, and has two sub-networks of geometry and appearance where appearance stylization cannot change the geometry. To enable arbitrary style transfer and more explicit and precise style adjustment, we introduce the CLIP model, which allows for style transfer based on either a text prompt or an arbitrary style image. We employ an ensemble of loss functions, of which CLIP loss ensures the similarity between the shared latent embeddings and generated style images, and Mask Loss is to constrain the 3D geometry to avoid non-smooth surface of NeRF. Experimental results demonstrate the effectiveness of our arbitrary 3D stylization generalized across diverse datasets. The proposed method outperforms most image-based and text-based 3D stylization models in terms of style transfer quality, producing pleasing images.
computer science, artificial intelligence, theory & methods,engineering, electrical & electronic, software engineering,optics