D-Flow: Differentiating through Flows for Controlled Generation

Heli Ben-Hamu, Omri Puny, Itai Gat, Brian Karrer, Uriel Singer, Yaron Lipman
2024-07-21
Abstract: Taming the generation outcome of state-of-the-art Diffusion and Flow-Matching (FM) models without having to re-train a task-specific model unlocks a powerful tool for solving inverse problems, conditional generation, and controlled generation in general. In this work we introduce D-Flow, a simple framework for controlling the generation process by differentiating through the flow, optimizing for the source (noise) point. We motivate this framework by our key observation stating that for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects gradient on the data manifold, implicitly injecting the prior into the optimization process. We validate our framework on linear and non-linear controlled generation problems including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance across all.
Machine Learning
What problem does this paper attempt to address?
This paper addresses controlled generation with pre-trained generative models (specifically diffusion and flow-matching models) without retraining a task-specific model. It proposes a framework named **D-Flow** that controls the generated output by differentiating through the generation process and optimizing the source (noise) point. The method applies to a range of controlled generation tasks, including image and audio inverse problems and conditional molecule generation, and the authors report state-of-the-art performance across all of them.

### Main Contributions:

1. **Proposed Framework**: The paper introduces a framework for controlled generation that differentiates through the generation process of a pre-trained diffusion or flow-matching model.
2. **Implicit Regularization**: It observes that, for diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold, which implicitly injects the prior into the optimization.
3. **Wide Applicability**: Experiments cover linear and non-linear controlled generation problems across several domains: image and audio inverse problems and conditional molecule generation.
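The core idea of optimizing the source point by differentiating through the flow can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the "pre-trained model" is replaced by the closed-form velocity field of an isotropic Gaussian data distribution (`MU`, `S`), the ODE is solved with plain Euler steps, the derivative of the generated sample with respect to the source point is accumulated by hand with the chain rule where a real setup would rely on automatic differentiation through a neural network, and the optimizer is plain gradient descent. The function names (`velocity_coeffs`, `flow_and_jacobian`, `d_flow`) are hypothetical.

```python
import numpy as np

# Toy stand-in for a pre-trained flow model (assumption): data ~ N(MU, S^2 I),
# source ~ N(0, I), Gaussian path x_t = (1 - t) * x0 + t * x1.
MU, S = 2.0, 0.5


def velocity_coeffs(t):
    """Return (a, b) such that the toy velocity field is u_t(x) = a * x + b."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2
    a = (t * S ** 2 - (1.0 - t)) / var_t
    b = MU - a * t * MU
    return a, b


def flow_and_jacobian(x0, n_steps=50):
    """Euler-integrate dx/dt = u_t(x) from t = 0 to t = 1.

    Returns x(1) and the scalar d x(1) / d x0, accumulated step by step
    with the chain rule (the quantity autodiff would provide in general).
    """
    h = 1.0 / n_steps
    x = np.array(x0, dtype=float)
    jac = 1.0
    for k in range(n_steps):
        a, b = velocity_coeffs(k * h)
        x = x + h * (a * x + b)   # one Euler step of the flow
        jac *= 1.0 + h * a        # derivative of this step w.r.t. its input
    return x, jac


def d_flow(y, mask, n_iters=200, lr=0.5, seed=0):
    """Minimize L(x0) = ||mask * (x(1) - y)||^2 over the source point x0."""
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(y.shape)     # start from a noise sample
    for _ in range(n_iters):
        x1, jac = flow_and_jacobian(x0)
        grad_x1 = 2.0 * mask * (x1 - y)   # dL/dx(1)
        x0 = x0 - lr * jac * grad_x1      # dL/dx0 = (dx(1)/dx0) * dL/dx(1)
    return flow_and_jacobian(x0)[0]


if __name__ == "__main__":
    y = np.array([1.5, 2.5, 3.0])         # observation
    mask = np.array([1.0, 1.0, 0.0])      # last coordinate is unobserved
    print(d_flow(y, mask))
```

In this sketch the observed coordinates of the generated sample are driven to the observation, while the unobserved coordinate receives no gradient and keeps the value generated from its initial noise. Because the toy prior is isotropic, it cannot show the manifold-projection effect described in the contributions; that would need a learned, non-trivial velocity field.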