Curved Diffusion: A Generative Model With Optical Geometry Control

Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or
2024-07-15
Abstract: State-of-the-art diffusion models can generate highly realistic images based on various conditioning like text, segmentation, and depth. However, an essential aspect often overlooked is the specific camera geometry used during image capture. The influence of different optical systems on the final scene appearance is frequently overlooked. This study introduces a framework that intimately integrates a text-to-image diffusion model with the particular lens geometry used in image rendering. Our method is based on a per-pixel coordinate conditioning method, enabling the control over the rendering geometry. Notably, we demonstrate the manipulation of curvature properties, achieving diverse visual effects, such as fish-eye, panoramic views, and spherical texturing using a single diffusion model.
Computer Vision and Pattern Recognition, Graphics, Machine Learning
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper addresses the problem of integrating camera geometry control into text-to-image diffusion models. Current state-of-the-art diffusion models can generate highly realistic images from various conditions (such as text, segmentation, and depth), but they typically ignore the geometry of the camera and lens used to capture an image. The paper proposes a framework that tightly couples a text-to-image diffusion model with the specific lens geometry used in image rendering. It conditions the model on per-pixel coordinates, which gives control over the rendering geometry and allows manipulation of curvature properties, so a single diffusion model can produce effects such as fish-eye views, panoramic views, and spherical texturing. In summary, the main goal is to generate images under arbitrary curved lens and camera geometries without compromising the quality of the existing model, which would serve professional photographers who need different lens effects.
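To make the idea of per-pixel coordinate conditioning more concrete, the following is a minimal sketch of how a coordinate map for a curved lens might be built. It is not the authors' implementation: the normalization to [-1, 1], the simple polynomial radial distortion, and the function names `identity_grid`, `radial_warp`, and `to_condition_channels` are all assumptions made for illustration. The summary does not describe how the paper constructs its maps or feeds them to the model.

```python
def identity_grid(height, width):
    """Per-pixel (x, y) coordinates normalized to [-1, 1] (undistorted view).

    Assumes height > 1 and width > 1.
    """
    grid = []
    for row in range(height):
        y = 2.0 * row / (height - 1) - 1.0
        grid.append([(2.0 * col / (width - 1) - 1.0, y) for col in range(width)])
    return grid


def radial_warp(grid, strength):
    """Scale each coordinate by 1 + strength * r^2 (hypothetical lens model).

    strength = 0 leaves the grid unchanged; a larger magnitude bends the
    coordinates more strongly away from the image center.
    """
    warped = []
    for row in grid:
        new_row = []
        for x, y in row:
            scale = 1.0 + strength * (x * x + y * y)
            new_row.append((x * scale, y * scale))
        warped.append(new_row)
    return warped


def to_condition_channels(grid):
    """Split the grid into two maps (x and y) with one value per pixel."""
    xs = [[x for x, _ in row] for row in grid]
    ys = [[y for _, y in row] for row in grid]
    return xs, ys


# Example: a 5x5 coordinate map with a mild radial distortion.
xs, ys = to_condition_channels(radial_warp(identity_grid(5, 5), 0.5))
```

In this sketch the two maps `xs` and `ys` play the role of extra conditioning channels with the same spatial size as the image. Changing only the warp function would describe a different lens to the same model, which is the kind of control the paper aims for.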